
How to create a robots.txt file

1 min read · Jul 28, 2022

A robots.txt file tells crawlers which parts of your site they may access. The file is placed at the root of the website: for the site www.google.com, it lives at www.google.com/robots.txt. It is a plain text file that follows the Robots Exclusion Standard and consists of one or more rules. Each rule blocks or allows access for a given crawler to a specified file path on that site.
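Since the file always sits at the root, its address can be derived from any page URL on the same site. Here is a minimal Python sketch of that idea; the helper name robots_txt_url is only an illustration, not part of any library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, drop the path, query and fragment:
    # robots.txt is always looked up at the root of the site.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.google.com/search?q=robots"))
# prints https://www.google.com/robots.txt
```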

Example robots.txt File

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: AdsBot-Google-Mobile
Disallow: /desktop/

User-agent: *
Allow: /

Here’s what that robots.txt file means:

  1. Googlebot is not allowed to crawl any URL that starts with http://google.com/nogooglebot/.
  2. AdsBot-Google-Mobile is not allowed to crawl any URL that starts with http://google.com/desktop/.
  3. All other user agents are allowed to crawl the entire site.
  4. The last rule could be left out with the same result, because by default user agents are allowed to crawl the entire site.
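One way to sanity-check rules like these is to feed them to Python's standard-library urllib.robotparser and ask what each crawler may fetch. This is only a rough sketch: the helper name allowed is made up for the example, and the standard-library parser matches agent names and rules in its own simple way, so a real crawler such as Googlebot may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from above, as a string.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: AdsBot-Google-Mobile
Disallow: /desktop/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` crawl `path` on the example site?"""
    return parser.can_fetch(agent, "http://google.com" + path)


print(allowed("Googlebot", "/nogooglebot/page.html"))        # False
print(allowed("Googlebot", "/about.html"))                   # True
print(allowed("AdsBot-Google-Mobile", "/desktop/index.html"))  # False
print(allowed("SomeOtherBot", "/nogooglebot/page.html"))     # True
```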

More examples:

# Example 1: Block only Googlebot
User-agent: Googlebot
Disallow: /

# Example 2: Block Googlebot and AdsBot
User-agent: Googlebot
User-agent: AdsBot-Google
Disallow: /

# Example 3: Block all crawlers except AdsBot (AdsBot crawlers must be named explicitly)
User-agent: *
Disallow: /
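If you generate robots.txt from a script, each group of rules like the ones above can be built from a list of user agents and a list of paths. Below is a small Python sketch under that assumption; render_group and its parameters are hypothetical names chosen for this example.

```python
def render_group(user_agents, disallow=(), allow=(), comment=None):
    """Render one robots.txt group as text (illustrative helper only)."""
    lines = []
    if comment:
        # Comments start with the # character.
        lines.append(f"# {comment}")
    # A group may name several crawlers, one User-agent line each.
    lines += [f"User-agent: {agent}" for agent in user_agents]
    lines += [f"Disallow: {path}" for path in disallow]
    lines += [f"Allow: {path}" for path in allow]
    return "\n".join(lines)


print(render_group(
    ["Googlebot", "AdsBot-Google"],
    disallow=["/"],
    comment="Block Googlebot and AdsBot",
))
```

Joining several rendered groups with a blank line between them gives the full file content.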

Important points to remember:

  • Your site can have only one robots.txt file.
  • The filename must be robots.txt.
  • A robots.txt file must be a UTF-8 encoded text file.
  • The # character marks the beginning of a comment.
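The points above can be sketched in Python as follows: write a file named robots.txt as UTF-8 text, then read it back to confirm that the comment line is ignored. The temporary directory is only a stand-in for your real web root, and the rule content is an assumption for the example.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

# Example content; the first line is a comment because it starts with #.
CONTENT = "# Block only Googlebot\nUser-agent: Googlebot\nDisallow: /\n"

# Stand-in for the site's root directory (an assumption for this sketch).
site_root = Path(tempfile.mkdtemp())
robots_path = site_root / "robots.txt"

# The file must be named robots.txt and encoded as UTF-8.
robots_path.write_text(CONTENT, encoding="utf-8")

# Read it back and parse it; the comment line has no effect on the rules.
parser = RobotFileParser()
parser.parse(robots_path.read_text(encoding="utf-8").splitlines())

print(parser.can_fetch("Googlebot", "/"))  # False
print(parser.can_fetch("Bingbot", "/"))    # True
```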

Read more about it

Cheers!!

Sangwin Gawande

About me : https://sangw.in
