Robots.txt Files: Small but Powerful

Posted in SEO, Web Site Design & Development, WordPress

An important but sometimes overlooked element of on-site search engine optimisation is the robots.txt file. Although the file is usually tiny, a poorly configured robots.txt file can cause serious indexation and search visibility problems.

Usually found in your website’s root directory, the robots.txt file, in simple terms, regulates how search engine bots crawl your site. Search engines use this file to determine which content they may crawl, and in turn index, on your site. Simply put, this lightweight file grants or denies permission to all or some specific search engine robots to access certain pages, or your site as a whole.

So why all the fuss about the robots.txt file?

One of the first things a search engine does is look for the robots.txt file in the root folder of your website. The search engine spider checks whether it has permission to access a page. If it does, it will crawl and index that page, and then move on to the other pages it is allowed to access.
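
To make that permission check concrete, here is a minimal sketch using Python’s standard urllib.robotparser module. It illustrates the idea rather than showing how any particular search engine works, and the /private/ rule and the yoursite.com URLs are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, parsed from a string so the example needs no network access.
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)


def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let this bot fetch the URL."""
    return parser.can_fetch(user_agent, url)


# A well-behaved bot asks before it fetches each page.
print(may_crawl("Googlebot", "http://www.yoursite.com/index.html"))      # True
print(may_crawl("Googlebot", "http://www.yoursite.com/private/a.html"))  # False
```

Keep in mind that this check is voluntary: robots.txt only works for bots that choose to honour it.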

In short:

  • If you don’t configure your robots.txt file correctly, it can affect your rankings
  • The file helps Google discover and crawl your website, and shapes how its bots interact with your site
  • It is an essential tool for helping search engines work correctly on your website

If you use the robots.txt file correctly, you give your website a better chance of success: search engines can crawl the right pages, which supports your rankings. You can choose not to use one, but remember that Google will crawl your website anyway, and wouldn’t you rather have it reading your website correctly?

What does a Robots.txt file look like?

The format of a robots.txt file is actually quite simple. A typical file looks something like this:

User-agent: *
Disallow:
Sitemap: http://www.yoursite.com/sitemap.xml

So let’s look at this in a bit more detail.

  • The first line specifies the name of the search bot you are addressing, for example Googlebot or Bingbot. You can use an asterisk (*) to address all bots.

  • The next line sets the rules for those bots, so they know which parts of your site they may crawl and which they may not. An empty Disallow, as in the example above, blocks nothing. For example:
    • Allow: /wp-content/uploads/
    • Disallow: /wp-content/plugins/
    • Disallow: /readme.html
  • The last line tells search engines where your sitemap is located.
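
The three parts described above can be assembled with a few lines of Python. This is only a sketch: build_robots_txt is a hypothetical helper written for this article, not part of any library, and it covers a single User-agent group.

```python
def build_robots_txt(user_agent, allow=(), disallow=(), sitemap=None):
    """Assemble the text of a simple, single-group robots.txt file."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in allow]
    lines += [f"Disallow: {path}" for path in disallow]
    if not allow and not disallow:
        # An empty Disallow means nothing is blocked.
        lines.append("Disallow:")
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


# Rebuild the WordPress-style rules from the list above.
print(build_robots_txt(
    "*",
    allow=["/wp-content/uploads/"],
    disallow=["/wp-content/plugins/", "/readme.html"],
    sitemap="http://www.yoursite.com/sitemap.xml",
))
```

Called with only a user agent and a sitemap, the helper reproduces the three-line file shown earlier.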

How do you determine if your website has a robots.txt file?

Quite simple really!

The robots.txt file is always located in the same place on any website, so it is easy to determine whether a site has one. All you need to do is add “/robots.txt” to the end of your domain name, as shown below.

www.mywebsitename.co.za/robots.txt

Whatever loads at that address is your robots.txt file. You will find a file with rules in it, an empty file or, in the worst-case scenario, no file at all!
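
If you would rather check from a script than a browser, the same idea can be sketched with Python’s standard library. The function names robots_url and has_robots_txt are made up for this example, and the sketch assumes the site answers over HTTPS when no scheme is given.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site: str) -> str:
    """Build the robots.txt address for a site, whatever page was given."""
    if "://" not in site:
        site = "https://" + site  # assumption: default to HTTPS
    parts = urlsplit(site)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def has_robots_txt(site: str, timeout: float = 10.0) -> bool:
    """Return True if the site answers with a robots.txt file."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Covers HTTP errors such as 404, network failures and bad URLs.
        return False


if __name__ == "__main__":
    # This part needs network access.
    print(robots_url("www.mywebsitename.co.za"))
    print(has_robots_txt("www.mywebsitename.co.za"))
```

Note that a True result only tells you a file exists; it can still be empty, so it is worth opening it and reading the rules.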

Why is a Robots.txt file important for SEO?

This file is a valuable communication link between your site and Google’s spiders. Listing your XML sitemap in robots.txt is a simple way to point search engines to it. The purpose of robots.txt is to instruct bots on how they should crawl your content. With or without this file, bots will still visit your website. Your job is to ensure they go to the right places for SEO reasons.

That’s all for now.

We hope this article helped you to understand the importance of your robots.txt file. Should you have any specific questions about it, we would be more than willing to help answer them. Simply contact us and one of our team will be in touch.