It’s important to understand how WordPress handles the robots.txt file. WordPress serves it virtually, meaning an actual robots.txt file doesn’t reside in your root directory. Yet if you go to http://www.yourdomain.com/robots.txt, you will still see a robots.txt file.
The contents of this virtual file can vary depending on the plugins you are running. The default robots.txt file is:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
This tells search engine robots not to crawl the wp-admin and wp-includes directories.
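If you want to confirm how a crawler would interpret these rules, you can test them with Python’s standard-library robots.txt parser. This is just a sketch; the rules mirror the default file above, and "Googlebot" is only an example user agent.

```python
# Parse the default WordPress robots.txt rules and spot-check two paths.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "/wp-admin/"))  # False: blocked
print(parser.can_fetch("Googlebot", "/my-post/"))   # True: crawlable
```

The same parser can fetch a live file with `parser.set_url("http://www.yourdomain.com/robots.txt")` followed by `parser.read()`, which is handy for checking what your site is actually serving.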
Search Engine Visibility Setting and the Robots.txt File
If you have set the option for search engines not to index this site, you will get this message when searching for your domain name in Google.
“A description for this result is not available because of this site’s robots.txt.”
This occurs when the following option is set in Settings/Reading:
I will often turn on this option when creating a new site. Once the site is ready to go live, don’t forget to deselect this option. Otherwise, the resulting robots.txt file will be:
User-agent: *
Disallow: /
This tells all robots not to crawl any part of your site, so it will never appear properly in search results.
Creating Your Own Robots.txt File
You can override the WordPress virtual robots.txt file by creating your own and placing it in your root directory. If you’re comfortable with a text editor and FTP, this is the best option. Create your own file (be sure to save it as robots.txt), then use FTP or your host’s cPanel to copy it to your root directory. A more complete robots.txt file might be:
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-login.php
Disallow: /wp-register.php
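Before uploading a hand-written file, it’s worth sanity-checking that it blocks what you intend and nothing more. A minimal sketch using the standard-library parser (the paths checked are just illustrative examples):

```python
# Verify the fuller robots.txt blocks WordPress internals but leaves
# content such as posts and uploaded media crawlable.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-login.php
Disallow: /wp-register.php
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Core WordPress internals are blocked...
print(parser.can_fetch("Googlebot", "/wp-login.php"))             # False
# ...but /wp-content/uploads/ is not listed, so media stays crawlable.
print(parser.can_fetch("Googlebot", "/wp-content/uploads/a.jpg"))  # True
```

Note that only the specific wp-content subdirectories listed are blocked; blocking /wp-content/ wholesale would also hide your uploaded images from search engines.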
What has been your experience with the robots.txt file?