Website Improvements #2: Custom Redirects

Our GO service has been, and will continue to be, our supported way of maintaining permalinks to resources. By publishing GO links to resources online and in print, you are able to move your resources to new homes (such as a different location in the new site, a blog, or a wiki) and update the GO link with the self-service GO management screens.

During planning for the web-makeover project, we decided to move forward with a new site architecture (where everything lives) and drop support for URLs from previous versions of the site, which are 3 to 15+ years old. Most of the time links can and should be updated at their original locations, but if that is impossible (such as in a print mailing), you can now ensure that the correct link shows up on the main site’s 404 page.

[Screenshot: 404 page with the GO link annotated]

Steps to add a link for a 404 page on the main site:

  1. Create a GO shortcut to the new destination if one doesn’t exist.
    Go to the GOtrol Panel and create a new GO shortcut to the new destination URL.
    If a GO shortcut for this destination already exists, you can skip this step.
  2. In the GOtrol Panel, click on the ‘Create’ tab and add an alias for your shortcut from step one. The important thing here is that the alias ‘name’ is the path portion of the URL that is hitting the 404 page after the initial ‘/’.

    For example, if this URL is getting a 404 page:
    http://www.middlebury.edu/area/department/someimportantpage/default.htm
    then the alias name should be:
    area/department/someimportantpage/default.htm

    [Screenshot: adding an alias in the GO admin screen]

  3. Go back to the 404 page and verify that it now includes the GO link to your resource.
    [Screenshot: 404 page with the GO link]
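
As a rough illustration of step 2, the alias name can be derived from the broken URL by taking its path and dropping the initial ‘/’. The sketch below uses Python’s standard library; the function name alias_name_for is made up for this example and is not part of GO, and it assumes any query string should be left out of the alias.

```python
from urllib.parse import urlsplit


def alias_name_for(url: str) -> str:
    """Return the path portion of a URL without its initial '/'.

    Hypothetical helper for illustration only; GO itself does not
    provide this function.
    """
    path = urlsplit(url).path  # path only, without query string or fragment
    # Drop only the first '/', as described in step 2.
    return path[1:] if path.startswith("/") else path


print(alias_name_for(
    "http://www.middlebury.edu/area/department/someimportantpage/default.htm"
))
# area/department/someimportantpage/default.htm
```

The printed value is what you would type into the alias ‘name’ field.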

We still recommend that you update the pages that link to the site to use their new URLs or GO links, but if that is impossible, you now have a work-around to direct users to the appropriate place.

10 thoughts on “Website Improvements #2: Custom Redirects”

  1. Michael Roy

    This is a very clever work-around. I wonder if we should consider actually redirecting people to the new page rather than forcing them to click on a link. I know many people like the lovely photo of Adam’s cabin, but at some point, given the number of broken links we have right now, they may be willing to give up the cabin view for a quicker trip to their desired destination.

    Two other points:

    1. Can we look at the logs to see which pages are most in need of GO redirects?
    2. Should we send this out to site owners since they may not be dedicated readers of the LIS blog?

    Nice work!

    — mike

  2. Adam Franco Post author

    Mike,

    We did consider auto-redirecting to the new pages, but felt that the downsides of this approach outweighed the benefits at this time.

    On your other two points:

    1. Logs: Until Google re-indexes our site, anyone who clicks a search result will hit a 404 page. Until search is working, it’s hard to know which pages really need a redirect. That said, Chris has been looking at the analytics reports and making a few redirects for URLs that we know will be a problem.

    2. Notifications: I’ve emailed the Helpdesk and the web-makeover project managers list. This update will go out to other editors in a larger email later this week.

  3. Michael Roy

    How about, as an interim solution, we do the redirects until we are properly indexed by Google? (I am also curious to hear more about the downsides, although you may not want to post those on the public web.)

    — mike

  4. Jason Mittell

    Quick question – will this work for subpages, or just the specific URL? So if there’s a GO redirect for ump/dept/blahblah, will a link to ump/dept/blahblah/subpage need to be configured separately?

  5. Adam Franco Post author

    Just specific URLs, so yes, the subpage would need to be configured separately.

    Ideally this shouldn’t be needed all that often, but is in place for when other solutions (such as updating links on the source page) aren’t possible.
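
To make the exact-match behavior concrete, here is a minimal sketch that models alias lookup as a plain dictionary lookup. This is an assumption about the behavior described in this reply, not GO’s actual code; the ALIASES table, the shortcut name, and find_go_shortcut are all hypothetical.

```python
from typing import Optional

# Hypothetical alias table: alias name -> GO shortcut name.
ALIASES = {
    "ump/dept/blahblah": "blahblah",
}


def find_go_shortcut(path: str) -> Optional[str]:
    """Look up a 404'd path by exact match only (no prefix matching)."""
    # Drop the initial '/' so the path has the same form as an alias name.
    name = path[1:] if path.startswith("/") else path
    return ALIASES.get(name)


print(find_go_shortcut("/ump/dept/blahblah"))          # blahblah
print(find_go_shortcut("/ump/dept/blahblah/subpage"))  # None
```

Under this model the subpage gets no link until it has an alias of its own.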

  6. Ian McBride

    If the site isn’t properly indexed by the start of next week, I will put in 301 redirects at the server level for all the academic departments, major offices, and sports team pages. We’re close to not needing this, as you can see the new pages beginning to appear in Google search results already. I would prefer not to have to resort to this, since we used that method to redirect all of the ~dept and ~office sites in 2003 and you can still see these paths appearing in search results. It’s preferable, in my opinion, to accept this relatively short interim of unease and allow the search bots to recognize sites in their index as unresolved and update their database. And if we are going to do site-level redirection for these areas, I would prefer to do it outside of GO, where we have access to regular expression matching and can serve up proper 301 Moved Permanently headers. GO serves up 302s because we want people to keep using the GO shortcuts and to keep the GO database relatively clean.

    The reason that we didn’t carry over the old IA, apart from an efficiency of management issue, is that we are now allowing site owners to control the information architecture of the site. On the last site, I could repoint all links to all office hours pages on all academic sites with a single line of redirection because I was the only person who could change the path name of a department’s subpages. This is no longer the case. While Adam’s fix does allow a bit of self-management for redirection, it’s a bit burdensome if you’re trying to repoint everything in your department’s site, as Jason notes. Additionally, our new namespace collides in some places with older versions of the site making redirection unusable for those areas.

    We have already added a couple of server-side redirects for addresses that were sent out in print publications within the last month and are now dead. However, I want to say that we’ve been very up front about encouraging people to use GO so that this wouldn’t happen. I updated over 900 GO shortcuts prior to the site launch so they would continue to work, we told the Project Managers in September that old site addresses would no longer work and that they should use GO whenever sending out publications, and this was repeated at our workshops. Still, if there are legitimate reasons why we need to repoint an address, such as links to your site on business cards or admissions publications which have already been printed, we will address those concerns.

    If we can be patient and wait until the search engines have reindexed our site we’ll get a site with fewer duplicate results for searches, since there will be fewer paths to content duplicated by redirection, a more flexible information architecture which is now fully controlled by the editors of our site rather than the web development group, and lower overhead on both clients and servers as users go through fewer request hops to get to the information they’re looking for.
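
For readers unfamiliar with the approach Ian describes, the sketch below shows the general idea of a regular-expression rule that answers with a 301 Moved Permanently status. It is an illustration only: the rule, both paths, and the function redirect_for are hypothetical, and the actual server configuration is not shown in this thread.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule: one pattern repoints the same page for every department.
RULES = [
    (re.compile(r"^/academics/([^/]+)/hours\.htm$"), r"/academics/\1/office-hours"),
]


def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (status, new_path) for the first matching rule, else None."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            # 301 tells browsers and search bots that the move is permanent.
            return 301, match.expand(target)
    return None


print(redirect_for("/academics/history/hours.htm"))
# (301, '/academics/history/office-hours')
```

A single rule like this only works while every department uses the same path names, which is the constraint discussed above.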

  7. Ian McBride

    Want to give a small update, though it may be premature. Google’s Webmaster Tools allows you to submit a URL for “reconsideration” or removal from their cache. However, the form that lets you do this is limited to one URL at a time. Fortunately, Google provides an undocumented RESTful API for this form. Also fortunately, someone wrote a Perl script (http://sourceforge.net/projects/urlremove/) that batch-submits this form, given a file with a list of URLs. Google’s Webmaster Tools also allows you to download an Excel spreadsheet of all the crawler errors caused by 404s on your site. I took this Excel sheet, copied the list of URLs into a config file for the Perl script and started it a-runnin’.

    Right now it has submitted 900 of about 71,000 total 404 errors for our site in Google’s index. They’re all marked as “Pending”. I don’t really know how Google will react to this, but according to their documentation they should honor removal requests for pages that no longer resolve. If this works, we should have a ‘clean’ version of our site returned by search requests, which will help people find pages that resolve. I’ll keep you all posted.

  8. Annie Dolber

    Whenever I get that page, it’s only on for a split second and then disappears, and another page comes up that doesn’t mean much to me. How would the Go link help in this case?
    Annie Dolber
