As I said at the start of this series, I aim to do at least one thing each week that improves our website for someone. Last week we had a number of improvements to the performance of the site that had a dramatic effect for everyone. Not every update to the site is quite that exciting. This update might not seem as significant, but it will help out people editing our site.

I also wanted to use this series to give more of a back-end explanation of what goes into these changes. This gives other schools and organizations running Drupal an opportunity to see what we’re up to, use our solutions to fix similar problems, and offer suggestions on how we could do this even better. I don’t want these posts to just be, “Yup, I added a button. Problem solved.” *dusts hands off*

But I do realize that those details might bore some people. So, if you want to know what’s changed:

  • You can click Preview Live Site in the Edit Console to hide all of the Edit, Delete, etc. links, hidden menu items and other content available only to editors.
  • There is a Preview button at the bottom of the editing form. Click it to see what your updates will look like before saving the node.

Preview Live Site

I’ve added a checkbox to the Edit Console labeled Preview Live Site. Clicking this will show or hide all of the links and content that are only visible to editors. These links are necessary to edit the site, but sometimes you want to be able to browse the site as a normal visitor would see it, so you can make sure that padding around images is correct and unpublished content isn’t being shown.

Here is the Library web site with all the editing links shown:

library_edit

And here it is in Preview Live Site mode:

library_live

In order to give you the option to browse around the site with this option either off or on, I set a cookie when you click on the checkbox. A browser cookie is a text file your browser creates on your machine and sends back to the site that created it whenever you visit the site. In the case of this browser cookie, you tell Middlebury that the value of “midd_live_preview” is “preview”. As long as your browser retains that cookie, you’ll see the site in “live mode”. Unchecking the checkbox clears the cookie.

This requires the jQuery Cookie plugin. I added a checkbox with the id “livepreview” to our “Edit Console”, which is a floating tab of options for editing the site using Monster Menus. And this is the jQuery-enabled JavaScript that makes this work:

$(function() { // on DOM ready
  // Links and content that only editors should see.
  var editorOnly = '.mm-block-links,div.links,.hidden-cat,.recycle-bin,.preview';
  // Cookie applies to the whole site and lasts 10 days.
  var options = { path: '/', expires: 10 };

  $('#livepreview').click(function() {
    if ($('#livepreview').is(':checked')) {
      $.cookie("midd_live_preview", "preview", options);
      $(editorOnly).hide();
    } else {
      // Setting the value to null removes the cookie.
      $.cookie("midd_live_preview", null, options);
      $(editorOnly).show();
    }
  });

  // On page load, restore whatever state the cookie recorded.
  if ($.cookie("midd_live_preview") == "preview") {
    $('#livepreview').attr('checked', true);
    $(editorOnly).hide();
  } else {
    $('#livepreview').attr('checked', false);
    $(editorOnly).show();
  }
});
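
For anyone curious what a cookie plugin is doing under the hood, here is a minimal sketch in plain JavaScript. It is not the plugin's actual source: the helper names buildCookie and readCookie are made up for this illustration, and the date is passed in only so the output is predictable.

```javascript
// Build the string a script would assign to document.cookie.
// (Hypothetical helper, not part of the jQuery Cookie plugin.)
function buildCookie(name, value, days, now) {
  const expires = new Date(now.getTime() + days * 24 * 60 * 60 * 1000);
  return encodeURIComponent(name) + '=' + encodeURIComponent(value) +
    '; expires=' + expires.toUTCString() + '; path=/';
}

// Find one value in a "name=value; other=value" cookie string,
// which is the format the browser exposes to scripts.
function readCookie(cookieString, name) {
  const pairs = cookieString.split('; ');
  for (const pair of pairs) {
    const index = pair.indexOf('=');
    if (index === -1) continue;
    if (decodeURIComponent(pair.slice(0, index)) === name) {
      return decodeURIComponent(pair.slice(index + 1));
    }
  }
  return null; // cookie not set
}

// A fixed date so the example is repeatable.
const now = new Date(Date.UTC(2010, 1, 15));
const header = buildCookie('midd_live_preview', 'preview', 10, now);
const stored = readCookie('has_js=1; midd_live_preview=preview', 'midd_live_preview');
```

Clearing a cookie works the same way in reverse: writing it again with an expiry date in the past (for example, days set to -1) tells the browser to discard it.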

Preview Button for Editors

I’ve added a button next to the Save button at the bottom of the node edit form for you to preview the change. Actually “added” is the wrong word, since the Preview button is always supposed to be there. There must have been issues with an earlier version of the editor that caused our colleagues at Amherst to hide this option, since they wrote:

mm_ui.inc:2335:    // TinyMCE screws up the body in previews, so remove this button for now
mm_ui.inc:2336:    unset($form['buttons']['preview']);

I was not able to detect any issues with the preview option, so I added this back in. Be sure to let me know if you notice anything awry. Very soon we will be moving to a newer version of TinyMCE, the WYSIWYG (What-you-see-is-what-you-get) editor we use. This will accompany the addition of the WYSIWYG Drupal module, which will let you choose whether you want to use TinyMCE or the FCKEditor. Stay tuned for more on this in a future update.

However, the normal Drupal preview mode shows you both the “teaser” and full versions of the node you’re posting. We use teaser versions in very few places on our site, making this potentially confusing to editors. Fortunately, Drupal lets me override the output of content through its theming system, and there’s a theme function for the preview mode of nodes. I added this quick function to our template:

function midd_node_preview($node) {
  $output = "";
  if ($node->body) {
    $output .= node_view($node, 0, FALSE, 0);
  }

  return $output;
}

The node_view function takes the node as its first argument, whether to display the teaser version as the second, whether to display the node as its own page as the third, and whether to display editing links as the fourth. We just tell it to show the node, as it would be displayed to a site visitor and be done with it.

Website Improvements #3: Better Performance [Extended Edition]

Here are some additional notes about the update Adam gave last week on how we were able to improve performance sitewide.

Attack of the Search Bots

Adam discovered that there was a page on our site that was displaying a linked tree of the permission groups for the site – all of them. A couple of search spiders found this page and started browsing its sub-pages. There are hundreds of thousands of permission groups: up to four for every course taught at Middlebury going back years, plus mailing lists and so on. It’s important to note right here that only the name of the permission group is displayed through this interface, not its members. Still, having search bots crawling all of these pages slowed our site down to a crawl, and there’s no reason we’d want this content indexed anyhow.

Adam added a rule to the robots.txt file for our site telling search bots to ignore this path, which will also remove it from their indexes. He also placed the pages behind authentication so that other users wouldn’t slow the site down by looking through there.

Reducing Hits on the Database

We also began to look at how many requests to the database were being made each time a page on the site was loaded. Pages like the Student Life home page require over 500 database queries to run before they are loaded. Most of this overhead is fetching various pieces of information about menu items, display settings, permissions, and the many fields that some nodes on our site use. There were two things that stood out in these results.

1. There were many queries related to the Workflow and Locale modules. Workflow allows us to create edit -> review -> publish workflows for content approval, but we haven’t set any of these up yet. Still, just having the module enabled requires Drupal to check the database for each node to see what its current workflow state is. The Locale module lets us provide multi-language versions of content and display a version in the local language of a person visiting the site. Since none of our content has been translated, this isn’t useful for our site. I disabled both modules as one step in improving site performance.

2. The left-hand menu requires many queries to load. On the home page, even though we aren’t displaying the menu, it was still loading all sub-pages of the home page, just in case we did want to display them. I hid all sub-pages of the home page in its menu. This reduced the number of database requests to load the home page from 100 to around 30. The home page is the page most often requested on our site, so making it as fast as possible improves performance site-wide.

Moving WordPress

We had originally hoped that we could set up a “high availability” MySQL server that would provide database space for Drupal, WordPress, MediaWiki and other applications we consider part of our “core” supported web applications. This desire came up last Spring when an update to the main database server caused an issue in a little-used, old, third-party application to spiral out of control and disrupt services on several of these high-use applications.

Unfortunately, it appears that the combination of WordPressMU and the Drupal instances of both www.middlebury.edu and www.miis.edu generate so much database activity that MySQL couldn’t keep cached versions of these queries in its memory. Cached queries are really important for performance. When you make a request to the database like “give me all of the stories on the Midd home page” it will read from the server’s disks to find the answer, then keep a copy of the answer in active memory. The next time the database is asked that question, it can skip the step where it looks up the information on the disk. Retrieving information from active memory is several orders of magnitude faster than reading from disk, but it’s a much more limited resource.

With both applications running on the same database server, the query cache would quickly fill up, overflow and empty out, meaning that most requests to the database were being served from the disk. After Adam and Mark worked to get Drupal on its own database server, without WordPress, over 90% of the queries sent to the database are served from the cache, greatly improving the responsiveness of the machine.
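
To make the overflow effect concrete, here is a toy simulation in JavaScript. It is only a sketch: the QueryCache class, the least-recently-used eviction rule, the cache size and the workloads are all assumptions picked for illustration, and MySQL's real query cache works differently in its details. The point is just that when the set of repeated queries is bigger than the cache, entries get evicted before they can be reused.

```javascript
// A tiny least-recently-used cache standing in for a database query cache.
// (Illustrative model only, not how MySQL implements its cache.)
class QueryCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map(); // a Map remembers insertion order, oldest first
    this.hits = 0;
    this.misses = 0;
  }

  run(query) {
    if (this.entries.has(query)) {
      this.hits += 1;
      const cached = this.entries.get(query);
      this.entries.delete(query); // re-insert to mark as most recently used
      this.entries.set(query, cached);
      return cached;
    }
    this.misses += 1; // in a real database this would be a slow disk read
    if (this.entries.size >= this.capacity) {
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest); // make room by dropping the oldest entry
    }
    const result = 'rows for ' + query;
    this.entries.set(query, result);
    return result;
  }

  hitRate() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

// Each hypothetical application repeats its own fixed set of queries.
function replay(cache, apps, rounds) {
  for (let round = 0; round < rounds; round += 1) {
    for (const app of apps) {
      for (let q = 0; q < app.queries; q += 1) {
        cache.run(app.name + ' query ' + q);
      }
    }
  }
  return cache.hitRate();
}

const drupal = { name: 'drupal', queries: 6 };
const wordpress = { name: 'wordpress', queries: 6 };

// Twelve distinct queries competing for eight cache slots.
const sharedRate = replay(new QueryCache(8), [drupal, wordpress], 10);
// Six distinct queries fit comfortably in eight slots.
const dedicatedRate = replay(new QueryCache(8), [drupal], 10);
```

In this made-up workload the shared cache never manages to reuse an entry, while the dedicated cache misses only on the first pass through its queries. Real traffic is far less regular, so treat the numbers as a picture of the mechanism rather than a measurement.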

Removing Locks

On Thursday, February 11 at 10:07AM, the site crashed and was down for about 15 minutes. The database had stopped processing new requests and needed to be restarted. Before doing that, we looked at the last database query in the queue. It was a request to create a new user group in the groups table. This happens whenever you save a page after adding a single user to the page permissions. To simplify permissions requests, Monster Menus groups all single users assigned to a page together and creates a pseudo-group in the database. So if you add me, Adam and Mark to be able to edit a page, Monster Menus will create a group with each of our user objects in it and assign that group permission to edit the page, rather than each of us individually.

In order to keep these separate from the rest of the groups, they are inserted into the groups table with a negative ID. So a group with a positive ID will be something like “All LIS Staff”, but a group with a negative ID might be “temp group of Ian, Adam and Mark”. The database engine is designed to handle the case where two positive ID group creation requests occur at the same time, but not two negative ID group creation requests. In that case, which request gets assigned which ID?

This problem can be solved by placing what is called a “lock” on the database table until the current request is done processing:

db_query('LOCK TABLES {mm_group} WRITE');
$gid = db_result(db_query('SELECT LEAST(-1, MIN(gid)-1) FROM {mm_group}'));
if (!$gid) $gid = -1;
db_query('INSERT INTO {mm_group} (gid, uid) VALUES(%d, %d)', $gid, $uid);
db_query('UNLOCK TABLES');

This says, “keep other requests out of the group table, give me the next lowest ID# from this table (in other words – the most negative ID#), create my new group, then let other requests have access”. The problem with this is: if the database fails to process the INSERT request for whatever reason, no other requests can access the table, new requests will pile up and the server will die.
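
Here is a small JavaScript sketch of why the lock matters. It is not Monster Menus code: the function nextNegativeGid and the sample IDs are invented for illustration, with an array standing in for the gid column of the group table.

```javascript
// Mirrors SELECT LEAST(-1, MIN(gid)-1): the next unused negative ID.
function nextNegativeGid(gids) {
  if (gids.length === 0) {
    return -1; // empty table: MIN() has nothing to return, so fall back to -1
  }
  return Math.min(-1, Math.min(...gids) - 1);
}

// Hypothetical existing group IDs: two normal groups, two pseudo-groups.
const table = [12, 7, -1, -2];

// Without a lock, two requests can both read before either one writes.
const requestA = nextNegativeGid(table);
const requestB = nextNegativeGid(table);
const collided = requestA === requestB; // both picked the same ID

// With a lock, each request reads and writes before the next one starts.
const lockedTable = table.slice();
const first = nextNegativeGid(lockedTable);
lockedTable.push(first);
const second = nextNegativeGid(lockedTable);
lockedTable.push(second);
```

In the unlocked case both requests compute the same ID, so one of the two INSERTs would clash with the other. In the locked case the second request sees the first one's row and moves on to the next ID down.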

There is a setting on the database called table_lock_wait_timeout that prevents this from happening. By default, this was set to a very high value, somewhere around 36,000 seconds, or 10 hours. We changed this to 30 seconds, which should give the server enough time to process the request or, if it can’t, let someone else have a chance.

The Access Denied page

The path to log into the site is http://www.middlebury.edu/cas (the last part is “cas” because we’re using the Central Authentication Service to provide single-sign-on to Drupal, Segue, WordPress, and MediaWiki). I had put the path to Drupal’s 403 (Access Denied) page in the configuration as this path, figuring that if people hit a page they could not view, it would direct them to the sign on page. However, because of the way Drupal draws its access denied page, it was actually bouncing people infinitely between the sign on page and the page to which they didn’t have access and not giving them a chance to sign on.

Several people had this happen to them, waited patiently – too patiently – for it to be resolved and caused a large number of requests to be generated against the site, decreasing site performance. Adam noticed this and set up an appropriate Access Denied page that resolves this issue.
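
The bouncing described above can be modeled with a few lines of JavaScript. This is a simplified sketch, not how Drupal or CAS handle requests internally: the followRedirects function and the paths /private-page and /access-denied are hypothetical, and each path simply maps to the path it redirects to.

```javascript
// Follow redirects from a starting path until a page renders,
// a path repeats, or we give up after maxHops.
function followRedirects(routes, start, maxHops) {
  const visited = [];
  let path = start;
  for (let hop = 0; hop < maxHops; hop += 1) {
    if (visited.includes(path)) {
      return { loop: true, visited: visited }; // we have been here before
    }
    visited.push(path);
    const next = routes[path];
    if (next === undefined) {
      return { loop: false, visited: visited }; // no redirect: the page renders
    }
    path = next;
  }
  return { loop: true, visited: visited }; // gave up: treat as a loop
}

// Before the fix (simplified): the denied page sends you to sign-on,
// and sign-on sends you straight back to the denied page.
const brokenRoutes = { '/private-page': '/cas', '/cas': '/private-page' };
// After the fix: the denied page leads to a real Access Denied page.
const fixedRoutes = { '/private-page': '/access-denied' };

const before = followRedirects(brokenRoutes, '/private-page', 20);
const after = followRedirects(fixedRoutes, '/private-page', 20);
```

A browser stuck in the first situation keeps issuing requests until it gives up or the visitor closes the tab, which is where the extra load on the site came from.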

That’s it for this week

If you have any questions, we’re always happy to answer them here and remember that we’re still taking feedback via the Web Feedback form. If I haven’t responded to your question through that form, let me know.