Don’t Forget Your Trailing Slash
The concept of ending URLs with a trailing slash is not a new one, but many companies I deal with every day pay little attention to how they structure their URLs when launching new websites.
How do you deal with the trailing slash on your website’s URL?
The Technical Issues
http://www.example.com.au/page
This URL (without the trailing slash) is not telling the server exactly what kind of file to look for, so the server will take an educated guess at the type of document you are trying to locate. The server will begin by checking for a ‘page’ directory and retrieve the default page for this directory if it exists.
If this directory doesn’t exist, it will continue to try to match other options until it finds a matching page or returns a 404 Page Not Found error. This costs you unnecessary server load and will increase load times for your web pages.
http://www.example.com.au/page/
This URL (now with the trailing slash) is telling the server exactly what to look for without the guessing. The server now looks in the ‘page’ directory and will retrieve the default document within that directory. No guessing means the server does its job quickly and efficiently.
This method also raises issues of expandability and security, which are well worth reading about in this great ALA article on the subject.
The SEO Issues
What many people don’t realise is that these two URLs can also be considered as two different URLs pointing to the same page:
http://www.example.com.au/page/
http://www.example.com.au/page
If you have two instances of virtually the same URL being used on your websites, Google may filter one of them and decide to index the other. More details on this concept of URL canonicalisation can be found in this article by Matt Cutts.
Which URL Google decides to filter may not be up to you. If you don’t address the issue and leave the search engines to solve this problem on their own, your organic search engine traffic can end up split between the two URLs.
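To check whether a site is already using both spellings, something like the following rough Python sketch could help. It is only an illustration: the function name group_slash_variants and the list of links are made up for this example, and it treats two URLs as the same page when they differ only by a trailing slash on the path.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_slash_variants(urls):
    """Group URLs that differ only by a trailing slash on the path.

    Hypothetical helper for illustration; not part of any library.
    """
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Key on scheme, host, query and the path with trailing slashes removed.
        key = (parts.scheme, parts.netloc, parts.path.rstrip("/"), parts.query)
        groups[key].add(url)
    # Keep only the groups where more than one spelling is in use.
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Example links, as they might be collected from a site's navigation.
links = [
    "http://www.example.com.au/page/",
    "http://www.example.com.au/page",
    "http://www.example.com.au/about/",
]
duplicates = group_slash_variants(links)
print(duplicates)
```

Any group this reports is a page that is currently reachable under two URLs and is worth tidying up.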
Your priority here, as with the technical solution above, is to make sure the search engines do not have to guess which URL is correct and to make sure it is absolutely obvious which URL you have chosen as your primary URL. This can be done in a few ways.
- Internal URLs
- make sure you use the same URL consistently across your website when linking to it from navigation or through internal linking
- External URLs
- you can’t control how people link to your website and chances are they are going to leave the trailing slash off the URL because it’s just easier.
- redirect or rewrite all instances of the URL without the trailing slash to the version with the trailing slash
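The redirect rule in the last point can be sketched in a few lines of Python. This is a minimal illustration rather than a drop-in server configuration: the function name trailing_slash_redirect is made up here, and the rule that a dot in the last path segment means "this is a file, leave it alone" is an assumption you would adjust for your own site. The web server or application would send the returned URL with a 301 (permanent) redirect.

```python
from urllib.parse import urlsplit, urlunsplit


def trailing_slash_redirect(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = urlsplit(url)
    path = parts.path
    last_segment = path.rsplit("/", 1)[-1]
    # Assumption: a dot in the last segment means a file such as /logo.png,
    # which should not be given a trailing slash.
    if path.endswith("/") or "." in last_segment:
        return None
    # Add the slash to the path only, keeping any query string or fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, path + "/", parts.query, parts.fragment)
    )


print(trailing_slash_redirect("http://www.example.com.au/page"))
print(trailing_slash_redirect("http://www.example.com.au/page/"))
```

The first call returns the version with the trailing slash, and the second returns None because the URL is already in its primary form.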
The Trailing Slash is Your Friend
- If you do not think about your URL structure on your website you could:
- cause unnecessary strain on your server
- serve web pages that take longer to download
- host pages that are harder to find on search engines.
Many people are aware of the problems of URL canonicalisation and take some steps to control its effects, but many are still unaware of the problems caused by the trailing slash not being applied consistently across your website.
With a few simple decisions and technical implementations you can easily improve the overall efficiency and findability of your website.
Comments
- Michael Koukoullis says: July 10, 2007 @ 3:16 am
Hi Scott,
Totally agree with the perspective. For all the Apache folk out there, easily rewrite directory referencing URLs to the trailing slash equivalent.
http://planetozh.com/blog/2004/05/apache-and-the-trailing-slash-problem/
http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
Cheers,
Michael Koukoullis
- Tim Lucas says: July 12, 2007 @ 5:59 pm
Ok ok I’ll bite. For someone who develops mostly web apps which don’t use index.html pages or Apache, I can give a slightly different perspective.
Using /pages instead of /pages/ is perfectly fine, as long as you link to it consistently and redirect to the former if it’s been mistyped as the latter.
The real problem with having multiple URIs that return the same resource/page arises when: you’re not being internally consistent, muddling a search engine’s view of your site’s linking structure; and you’re not redirecting to the correct URI if it’s mistyped, sacrificing precious google juice when users link/bookmark the incorrect URI.
Instead of “always use a trailing slash” I’d say “make sure your use of trailing slashes is consistent” and “ensure there’s only one valid URI for a page/resource, redirecting people when they get it wrong”
- Standardzilla says: July 12, 2007 @ 6:21 pm
@Tim - I agree with the consistent internal usage, one URI per page, etc.
Just out of curiosity, why would you choose /pages over /pages/? Are we talking legacy URIs?
- Tim Lucas says: July 13, 2007 @ 11:29 am
No legacy URIs here. It’s for the same reason you call your documents folder “Documents” rather than “Document”, it represents a collection of resources. If it’s a single resource, such as “about-me” then the singular form obviously makes more sense.
All the REST talk and dev I’ve done has changed the way I look at URIs a little.
If you were to go to your flickr account, would you think of typing http://flickr.com/photos/toolmantim/ or just http://flickr.com/photos/toolmantim? The trailing slash in the former is a little redundant, and if most people are getting it wrong and you’re always doing redirects then maybe that’s the best place for it.
- Tim Lucas says: July 13, 2007 @ 12:51 pm
ok now toolmantim.com is walking the talk, and here’s the code for those who’re interested.
- Standardzilla says: July 13, 2007 @ 6:49 pm
Here’s a thought.
I wouldn’t actually type http://flickr.com/photos/toolmantim with or without the slash. What I would do is type http://www.flickr.com, then find my contacts links and click on the appropriate link to get to your page.
So this would mean depending on how people navigate or find your site (eg. type in the URL vs. clicking internal linking structure), this is how you decide between the trailing slash or not.
Which brings us back to simply being consistent about how you redirect and link across the site (for SEO/link love purposes).
But then what about the difference of having the trailing slash with regards to server load and download times? Is this even worth worrying over? I would love to see some numbers on this one.
This is turning into a bit of a beer conversation :-)
- Jermayn Parker says: July 16, 2007 @ 11:42 am
I have always wondered what the / was all about and if there was a difference. In WP you can choose whether you want it or not.
Would it just be easier to not use it so then when people link to it without the / it would stay the same?
- Nate Klaiber says: July 18, 2007 @ 1:23 am
Finally someone has mentioned the benefit of one point of entry and the trailing slash. Same is true for www or non-www, you need to stick to a standard. For analytics especially, that can potentially view one page as 4 different pages (www, non-www, page, directory, etc). You want to make sure everything works as planned, regardless of user input or links.
It usually took some planning with mod_rewrite - but I would always find the instance of the url without the trailing slash, then do a 301 redirect to the instance with the trailing slash. Thereby preserving all links from others that may/may not use a trailing slash - as well as keeping your analytics/logs neat and tidy.
- What is Apache? says: November 24, 2007 @ 8:02 am
Why does this need to be configured in WP? Why can’t Apache automatically redirect instead of serving the page?
And even if we set up Apache to redirect, how does that save resources? The server will then need to serve two requests instead of one, right?
