Wildcards in Robots.txt
One of the greatest (mostly) unknown abilities of robots.txt is wildcard pattern matching. We know robots.txt can block files and directories from being crawled, but for URLs with unique parameters and duplicate content issues, did you know that Google and Yahoo respect wildcards? (This was verified by connections at the engines, though MSN said they do not respect pattern matching "at this time.")
If you have URLs with unique parameters – for example, UTM tags from Google Analytics, paid search tags, and so on – you can create a robots.txt entry like this:
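A minimal sketch of such an entry, assuming your tracking parameters all start with `utm_` (the `*` wildcard matches any sequence of characters in the path or query string):

```
# Block any URL containing a Google Analytics UTM parameter
User-agent: *
Disallow: /*utm_
```

This tells wildcard-aware crawlers to skip any URL whose path or query string contains `utm_`, so the tagged duplicates of your pages are never crawled while the clean URLs remain accessible.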
How cool is that? Remember, this should only be employed if your parameters are truly unique. If your parameters are keyworded, and that keyword also appears in directory or page names, those will get blocked too – quite possibly to your dismay.
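To illustrate the danger with a hypothetical keyworded parameter called `shoes`:

```
# DANGEROUS: intended to block ?shoes=... tracking URLs,
# but the pattern matches anywhere in the URL
User-agent: *
Disallow: /*shoes
```

Because the wildcard matches any occurrence of the string, this entry would also block legitimate pages like `/shoes-sale/` or `/mens-shoes.html` – exactly the over-blocking the warning above describes.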
More from Google’s Webmaster Blog.