Have you ever searched for something in Google and received a result whose description read something like “a description is not available because of this site’s robots.txt file”?
This particular website has set up a robots.txt file that blocks web crawlers from accessing the site for indexing. It’s possible they did it intentionally, but far more likely that it was an oversight. For websites built in WordPress, the fix is as simple as unchecking a box. (Note: the website in the above example is a WordPress site, and I did reach out to let them know.)
Where Can I Check My Robots.txt File?
One easy place to check and test your robots.txt is through Google Webmaster Tools. If you’re not sure what that is, or how to set it up, learn more about it here. From the dashboard, the click flow is:
Web Property > Crawl > Robots.txt Tester
It will display your current robots.txt file and let you simulate how Google’s various crawlers would crawl your URLs. Note that it only simulates Googlebots, not the bots of other search engines.
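If you’d like to run a similar check outside of Google’s tool, here is a minimal Python sketch using the standard library’s urllib.robotparser module. The rule sets, the example.com URLs, and the can_crawl helper are illustrative assumptions rather than anything from a real site, and Python’s parser may not match Googlebot’s behavior in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents that block every crawler from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt contents that block only one folder.
BLOCK_ONE_FOLDER = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, rules in [("block all", BLOCK_ALL), ("block one folder", BLOCK_ONE_FOLDER)]:
        for url in ["https://example.com/about/", "https://example.com/private/report"]:
            print(name, url, can_crawl(rules, "Googlebot", url))
```

To test a live site instead of a string, RobotFileParser also offers set_url() and read(), which download the robots.txt file before you call can_fetch().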
It’s important to note that a robots.txt file is merely a request, not an enforcement mechanism; it’s up to the individual bots to honor it. There are certainly cases where a website would want to block certain content from being indexed. But if you want to be completely hidden from the web, robots.txt alone won’t do it, since bots that ignore the file can still read your pages. There are further steps you’d have to take.
Additionally, the robots.txt file is a public file, so if someone with malicious intentions wanted to, they could see which folders a website was trying to hide.
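To make that point concrete, here is a short Python sketch of how little effort it takes to pull the blocked paths out of a robots.txt file. The disallowed_paths helper and the sample file contents are hypothetical, made up for illustration only.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect every non-empty Disallow value from robots.txt text."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are case-insensitive; an empty Disallow blocks nothing.
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file contents, not taken from any real website.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backups/  # old exports
Allow: /
"""

if __name__ == "__main__":
    print(disallowed_paths(SAMPLE))
```

Anyone can fetch the file from the root of a site and read it the same way, so listing a folder in robots.txt is not a way to keep it secret.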
For most websites, there isn’t a benefit to blocking an entire site from web crawlers. In the search example above, I typed a branded phrase that I would estimate drives over 90% of their traffic. When I see something like “a description is not available because of this site’s robots.txt file”, it looks a little suspicious, and maybe I don’t click on the link. Think of the lost traffic as a result!
It’s often a simple fix. For sites built in WordPress, here’s how you can resolve it. From the dashboard:
Settings > Reading > Search Engine Visibility
Make sure the box that says “Discourage search engines from indexing this site” is unchecked. Some developers may leave this box checked during the development process to discourage indexing of a test site.
And remember: this goes for your mobile site as well!