Overview
Boost "provides static page caching for Drupal enabling a very significant performance and scalability boost for sites that receive mostly anonymous traffic". Since virtual host proxies moved to the WWW servers (January 2012), Boost can be installed and configured in the Stanford Leland web environment.
Caching represents a tradeoff between performance and fresh content. If your site content changes relatively infrequently, you can configure Boost to keep a static HTML cache of your pages and serve those cached copies very quickly.
Note: These instructions are for the Drupal 6 versions of Boost. Different versions of the Boost module have different options, so your configuration page might look slightly different than the screenshots below.
Last reviewed/updated: Dec. 2012
Getting Ready
- Check whether your virtual host is served by the proxy servers or the WWW servers:
ping myvanityurl.stanford.edu
If the name resolves to "www-lb.stanford.edu" instead of "proxy*.stanford.edu", you're ready to Boost.
- (If it resolves to "proxy*.stanford.edu", you will have to submit a HelpSU request asking that the virtual host be moved from the proxy servers to the WWW servers. See the article on Drupal and Virtual Host Proxies on the WWW Servers for more information.)
- Download and enable Boost as you would any other Drupal module.
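The hostname check in the first step can be scripted. The following is a minimal sketch, not an official tool: the function name classify_vhost and its "ready" / "needs-move" / "unknown" labels are hypothetical, and the usage line assumes a ping that accepts -c and prints the resolved name as the second word of its first output line.

```shell
# Classify the name that a vanity URL resolves to.
# $1 = resolved host name (for example, taken from the first line of ping output)
classify_vhost() {
  case "$1" in
    www-lb.stanford.edu*) echo "ready" ;;        # served by the WWW servers
    proxy*.stanford.edu*) echo "needs-move" ;;   # still on the proxy servers
    *) echo "unknown" ;;                         # anything else: check by hand
  esac
}

# Usage (requires network access, so it is left commented out):
#   name=$(ping -c 1 myvanityurl.stanford.edu | head -n 1 | awk '{print $2}')
#   classify_vhost "$name"
```

If the result is "needs-move", submit the HelpSU request described above before continuing.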
Configuring Boost
Go to admin/settings/performance/boost. The Boost configuration screen can be a bit overwhelming, so let's break it down section by section. The following settings are simply suggested starting points; you will likely need to tweak them based on your site's specific needs.
Some of the options will be grayed out until you enable the options they depend on, so you may have to save the settings on this page several times before it matches the settings below.
Once you have this page configured properly, you need to go to admin/settings/performance/boost-rules, copy the .htaccess rules, and paste them in your .htaccess file. If you do not do this, Boost will not work.
Boost File Cache
- Static page cache: Enabled
- Gzip page compression: Enabled
- HTML/XML/JSON Default maximum cache lifetime: 8 weeks. You can set this to whatever you like; a longer duration means better performance, but potentially stale content. If the site is low traffic with infrequent changes, it makes sense to set this to a high number, so that anonymous users will almost always receive a cached version. This setting can be overridden on a per-page basis.
- Clear ALL and Clear expired buttons: use these to clear the Boost cache
Boost Cacheability Settings
You can check all of the boxes in this section, and choose Cache every page except the listed pages (and leave the section below blank, or enter specific pages to exclude). This is a relatively aggressive configuration, caching all HTML, XML, JSON, CSS, and JavaScript files. The only pages it will not cache are those with PHP errors, Drupal messages, or pages that redirect to aliased pages (e.g., the Global Redirect module redirects node/8 to content/foo).
Boost Cache Expiration / Flush Settings
This section enables you to control what conditions should trigger a clearing of various cached files.
- Clear expired pages on cron runs: Enabled. This setting ensures that cached files are deleted during cron runs, once they've exceeded the specified cache lifetime.
- Check database timestamps for any time changes: Unchecked. This setting compares the node's updated timestamp with the creation date of the cached HTML file, and flushes the HTML file only if the node has been updated in the interim. In my experience this setting was not 100% reliable, leading to stale content, so I usually leave it disabled.
- Clear all empty folders from cache: Checked. On Leland it is safe to leave this checked.
- Clear the front page cache...: Checked. Ensures fresh front page content.
- Clear all cached pages referenced...: Checked. Ensures fresh CCK nodereference content.
- Clear all cached terms pages...: Checked. Ensures fresh content on taxonomy/term/% pages.
- Clear all cached pages in a menu...: Flushes entire menu tree. The options here range from fast/potentially stale to slow/fresh. This only applies when menu items are being saved or modified. Flushing the entire menu tree is chosen as the default here because if you are making changes to menus you usually want them to show right away.
- Clear all cached views pages associated...: Both checked. As with the menu settings, checking these two options sides with the slow/fresh approach.
- Clear Boost's cache when site goes offline: Checked. Leave this unchecked if you want to serve up cached static pages while taking your site offline.
- Flush all sites caches in this database: Unchecked. (Only check if you have a multisite install with a single database.)
- Expire content in DB, do not flush file: Unchecked. As with the database timestamps setting above, I have found this setting not to be 100% reliable, sometimes resulting in stale content.
- Ignore cache flushing: Only ignore clear entire cache commands. This setting means that "drush cc all" will not clear the Boost cache.
Boost Directories
You can safely leave all these settings at their default values.
Boost Advanced Settings
- Preprocess function: None
- Aggressive setting of the boost cookie: Checked
- Asynchronous Operation: Checked
- Overwrite the cached file if it already exists: Unchecked
- Turn off clean urls for logged-in users: Unchecked
- Aggressive Gzip: Unchecked
- Files: Leave empty (not applicable on AFS)
- Directories: Leave empty (not applicable on AFS)
- Watchdog verbose setting: 3. You may want to set this at a higher level when initially configuring Boost, then lower it for production.
- Disable warning about reaching the ext3 file system subdir limit: Not applicable.
Boost Crawler
This section configures Boost's internal crawler to go through your site and pre-emptively create cached versions of all pages. It is very useful, especially if you have a low-traffic site.
- Enable the cron crawler: Checked
- Do not flush expired content on cron run...: Unchecked. My experience with checking this setting is that it sometimes results in stale content.
- Preemptive cache HTML, XML, AJAX/JSON: All Checked
- Crawl all URLs in the url_alias table: Checked
- Number of URLs to grab at a time...: 10
- Crawler throttle: 1000000. (1 million microseconds = 1 second)
- Crawler batch size: 10
- Number of threads: 1
- Reset Crawler and Cron Semaphore: If you attempt to run cron (e.g., via "drush cron") and get an error message along the lines of "could not start cron because cron is already running", you may need to hit this button to resolve it.
Boost Apache .htaccess Settings Generation
This section is crucial to getting Boost to work in the Leland environment (specifically the first two settings). If you leave those two set at the defaults, Boost will not work.
- Server's URL or Name: myvanityurl.stanford.edu (the third option)
- Document Root: /afs/ir/group/mygroup/cgi-bin/drupal (the second option)
- ETag Settings: Set FileETag 'All'
- Boost Tags: Set header and tags. This will add a boost-specific header, and an HTML comment at the bottom of the source code like this:
<!-- Page cached by Boost @ 2011-12-06 11:54:45, expires @ 2012-01-31 11:54:45 -->
- Follow RFC2616 14.9.4: Checked
- Ignore .htaccess warning: Unchecked
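The Boost comment shown above also lets you confirm the cache lifetime a page was stored with. The sketch below is illustrative only: the function name boost_lifetime_days is hypothetical, and it assumes GNU date (for the -d option) and a comment in exactly the format shown above.

```shell
# Read HTML on stdin, find the first Boost cache comment, and print the
# number of whole days between its "cached" and "expires" timestamps.
# Assumes GNU date; timestamps are treated as UTC to avoid DST skew.
boost_lifetime_days() {
  line=$(grep -o 'Page cached by Boost @ [^,]*, expires @ [0-9-]* [0-9:]*' | head -n 1)
  [ -n "$line" ] || return 1
  cached=$(printf '%s\n' "$line" | sed 's/.*Boost @ \([^,]*\), expires.*/\1/')
  expires=$(printf '%s\n' "$line" | sed 's/.*expires @ //')
  c=$(date -u -d "$cached" +%s) || return 1
  e=$(date -u -d "$expires" +%s) || return 1
  echo $(( (e - c) / 86400 ))
}

# Usage (requires network access, so it is left commented out):
#   curl -s http://myvanityurl.stanford.edu/ | boost_lifetime_days
```

For the sample comment above, the two timestamps are 56 days apart, which matches the 8-week default lifetime suggested earlier.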
Clear Boost's Database & File Cache
Click this button to clear all Boost cached data.
Editing the .htaccess File
Once you have configured all the settings at admin/settings/performance/boost, you need to go to admin/settings/performance/boost-rules and copy the rules there, then paste them into your .htaccess file. Paste them right before the following line:
# Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
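If you would rather not paste by hand, the insertion can be sketched as a small shell function. This is an assumption-laden example: insert_boost_rules and the file name boost-rules.txt are hypothetical, and it assumes you have saved the rules copied from admin/settings/performance/boost-rules into a text file. It writes the merged result to standard output so you can review it before replacing anything.

```shell
# Print the .htaccess file ($1) with the contents of the rules file ($2)
# inserted immediately before the first "# Rewrite URLs of the form" line.
insert_boost_rules() {
  awk -v rules="$2" '
    index($0, "# Rewrite URLs of the form") && !done {
      # Copy every line of the rules file ahead of the marker line.
      while ((getline line < rules) > 0) print line
      close(rules)
      done = 1
    }
    { print }
  ' "$1"
}

# Usage (review the output before moving it into place):
#   insert_boost_rules .htaccess boost-rules.txt > .htaccess.new
```

Keep a backup of the original .htaccess file so you can revert if the site returns errors.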
Test It
Now, visit your site as an anonymous (not-logged-in) user. (You may have to load a page twice, once to prime the cache, and a second time to retrieve the cached version.)
View the HTML source of the page and you should see the "page cached by Boost" HTML comment at the bottom of the HTML document. (Note that Boost will not cache pages sent over https, so you must use the http version of your site.)
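The same check can be done from the command line. A minimal sketch, assuming curl is installed; the function name is_boost_cached is hypothetical, and it only looks for the "Page cached by Boost" comment described above.

```shell
# Succeed (exit status 0) if the HTML on stdin contains Boost's cache comment.
is_boost_cached() {
  grep -q 'Page cached by Boost @'
}

# Usage (requires network access, so it is left commented out).
# Request the page twice: the first request may only prime the cache.
#   curl -s http://myvanityurl.stanford.edu/ > /dev/null
#   curl -s http://myvanityurl.stanford.edu/ | is_boost_cached && echo "served from Boost cache"
```

Because curl sends no session cookie, it behaves like an anonymous visitor.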
You can run before-and-after tests using tools like ApacheBench, YSlow, or Google Page Speed; on a site serving mostly anonymous traffic you should see a marked improvement.
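For a before-and-after comparison with ApacheBench, it helps to pull out just the requests-per-second figure. The helper below is a sketch: ab_rps is a hypothetical name, and the request count (100) and concurrency (5) in the usage line are arbitrary illustrative values, not recommendations.

```shell
# Print the "Requests per second" value from ApacheBench output on stdin.
ab_rps() {
  awk '/^Requests per second:/ { print $4; exit }'
}

# Usage (requires network access, so it is left commented out).
# Run once with Boost disabled and once with it enabled, then compare:
#   ab -n 100 -c 5 http://myvanityurl.stanford.edu/ | ab_rps
```

Keep the load modest when testing against shared servers.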
Troubleshooting
Boost writes a .htaccess file in its cached files directory (DRUPALROOT/cache/.htaccess). If you get the dreaded "Server Error" message for Boost cached pages, you will need to comment out the three lines at the bottom like so:
# SetHandler Drupal_Security_Do_Not_Remove_See_SA_2006_006
# Options None
# Options +FollowSymLinks
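The same edit can be sketched with sed. This is an illustrative example: comment_out_boost_directives is a hypothetical name, it assumes a sed that supports -E, and it only touches lines that consist of exactly one of the three directives named above. Lines that are already commented out are left alone.

```shell
# Read an .htaccess file on stdin and print it with the three directives
# commented out; every other line passes through unchanged.
comment_out_boost_directives() {
  sed -E 's/^([[:space:]]*)(SetHandler Drupal_Security_Do_Not_Remove_See_SA_2006_006|Options None|Options \+FollowSymLinks)[[:space:]]*$/\1# \2/'
}

# Usage (run from DRUPALROOT and review the result before replacing the file):
#   comment_out_boost_directives < cache/.htaccess > cache/.htaccess.new
```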
Tweak It
Go to admin/build/block and enable the "Boost: Pages cache configuration" and "Boost: Pages cache status" blocks. (Be sure to configure them only to appear for administrator roles.)
These blocks will allow you to configure Boost settings on a per-page basis.
- Maximum Cache Lifetime: Set the cache lifetime to a different time period than the sitewide default
- Preemptive Cache: Set to Yes to allow the cron crawler to cache this page
- Scope: Use these settings for just this page, all pages of this content type, or other options which vary by page type (Views, Panels, etc.)
- Set Configuration: Apply changes (if you've changed the above from Defaults)
- Delete Configuration: If you want to revert settings to the sitewide defaults
- Flush page: Clear the page from the Boost cache, requiring it to be rebuilt the next time it is visited by an anonymous user.