A robots.txt file is a small text file that lives in your site's root folder. It tells search engine bots which parts of the website to crawl and index and which parts to skip.
If you make a mistake while editing or customizing it, search engine bots may stop crawling and indexing your site, and your site may disappear from the search results.
In this article, I will explain what a robots.txt file is and how to create a perfect robots.txt file for SEO.
Why Is a Robots.txt File Required for a Website?
When search engine bots visit a site or blog, they follow the instructions in the robots.txt file and crawl the content accordingly. If your website does not have a robots.txt file, the bots will crawl and index all the content and pages of your site, including those you do not want indexed.
Search engine bots look for the robots.txt file before crawling any site or webpage. When they do not find any instructions in a robots.txt file, they crawl and index all the webpages and content of the site.
Note: This is why a robots.txt file is required. If we do not give instructions to the search engine bots through this file, they index our entire site, including data we did not want indexed.
Advantages Of Robots.txt File
- It tells search engine bots which parts of the website to crawl and index and which parts to skip.
- A particular file, folder, image, PDF, etc. can be prevented from being indexed by the search engine.
- Sometimes search engine spiders crawl your site like a black mamba, which hurts your site's performance. You can reduce this by adding a Crawl-delay directive to your robots.txt file, which protects your server from being overloaded. Googlebot does not obey this directive, but you can set the crawl rate in Google Search Console.
- You can keep crawlers out of an entire section of a website.
- You can prevent internal search results pages from appearing in SERPs.
- You can improve your website's SEO by blocking low-quality pages.
Where Is the Robots.txt File Located on the Website?
If you are a WordPress user, it resides in your site's root folder. Search engine bots look for the file only in this location; they do not search your entire site for it. If the file is not found there, the bots start indexing your entire website.
If you do not know whether your site has a robots.txt file, simply type this into your browser's address bar:
example.com/robots.txt
A text page will open in front of you, as you can see in the screenshot.
This is the robots.txt file of DigitalParatha. If you do not see any such text page, you have to create a robots.txt file for your site.
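The same check can be scripted. The following is a minimal Python sketch, not a definitive tool: the helper names robots_txt_url and has_robots_txt are made up for this illustration, and it assumes the site answers ordinary HTTP(S) requests. It builds the robots.txt address from any page URL and reports whether the file exists.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def has_robots_txt(page_url: str, timeout: float = 10.0) -> bool:
    """Hypothetical helper: True if the site serves a robots.txt file."""
    try:
        with urlopen(robots_txt_url(page_url), timeout=timeout) as response:
            return response.status == 200
    except HTTPError:
        # For example a 404: no file was found, so one has to be created.
        return False
```

Calling robots_txt_url("https://example.com/blog/post") gives the same address you would type by hand. Network failures other than HTTP errors are deliberately left to raise, so a connection problem is not mistaken for a missing file.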
Basic Format of Robots.txt File for SEO
The basic format of the robots.txt file is very simple and looks like this:
User-agent: [user-agent name]
Disallow: [URL or page you don't want to crawl]
These two lines are considered a complete robots.txt file. However, a robots.txt file can contain multiple groups of user agents and directives (disallows, allows, crawl-delays, etc.).
- User-agent: Names the search engine crawler (bot) that the rules apply to. If you want to give the same instructions to all search engine bots, use the * sign after User-agent, like this: User-agent: *
- Disallow: This prevents bots from crawling the listed files and directories.
- Allow: This allows search engine bots to crawl and index your content.
- Crawl-delay: How many seconds the bots have to wait before loading and crawling the page content.
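To see how these four directives behave together, here is a small sketch using Python's standard urllib.robotparser module. The rules and paths are hypothetical and chosen only for illustration. Note that Python's parser applies the first rule that matches, so the Allow line is placed before the broader Disallow line; other crawlers may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; the paths are not from a real site.
SAMPLE_RULES = """\
User-agent: *
Crawl-delay: 10
Allow: /private/public-page.html
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_RULES.splitlines())

# Disallow: everything under /private/ is off limits...
blocked = parser.can_fetch("*", "https://example.com/private/secret.html")
# ...except the one page that the earlier Allow line opens up.
allowed = parser.can_fetch("*", "https://example.com/private/public-page.html")
# Crawl-delay: the number of seconds a polite bot should wait between requests.
delay = parser.crawl_delay("*")

print(blocked, allowed, delay)
```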
Preventing All Web Spiders from Indexing Websites
User-agent: *
Disallow: /
Using these directives in the robots.txt file stops all web crawlers/bots from crawling the website.
All Web Spiders Allowed to Index All Content
User-agent: *
Disallow:
These directives in the robots.txt file allow all search engine bots to crawl every page of your website.
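The difference between "Disallow: /" and an empty "Disallow:" is easy to get wrong, so it can help to test both. This is a rough sketch with Python's standard urllib.robotparser; the helper name is_allowed and the sample URL are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_text: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and ask about one URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


# A single slash blocks the whole site; an empty value blocks nothing.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

url = "https://example.com/any/page.html"
print(is_allowed(BLOCK_ALL, "Googlebot", url))
print(is_allowed(ALLOW_ALL, "Googlebot", url))
```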
Blocking a Specific Folder for Specific Web Spiders
User-agent: Googlebot
Disallow: /example-subfolder/
This rule stops only Google's spiders from crawling example-subfolder. If you want to block all spiders, your robots.txt file will look like this:
User-agent: *
Disallow: /example-subfolder/
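As a quick sanity check, the Googlebot-only rule can be evaluated for two different bots. This sketch again uses Python's standard urllib.robotparser; the page URL inside the subfolder is an assumption for illustration.

```python
from urllib.robotparser import RobotFileParser

# The Googlebot-only rule from the article.
GOOGLEBOT_ONLY = """\
User-agent: Googlebot
Disallow: /example-subfolder/
"""

parser = RobotFileParser()
parser.parse(GOOGLEBOT_ONLY.splitlines())

# Hypothetical page inside the blocked folder.
url = "https://example.com/example-subfolder/page.html"
results = {agent: parser.can_fetch(agent, url) for agent in ("Googlebot", "Bingbot")}
print(results)
```

Only Googlebot is refused here; Bingbot has no matching group and there is no "User-agent: *" group, so it stays allowed.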
Preventing a Specific Page (Thank You Page) from Being Indexed
User-agent: *
Disallow: /page URL (Thank You Page)
This will stop all spiders from crawling that page URL. If you want to block only a specific spider, write it like this:
User-agent: Bingbot
Disallow: /page URL
This rule will stop only Bingbot from crawling your page URL.
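If you maintain several such rules, generating the file from data avoids formatting slips such as putting two directives on one line. The sketch below is one possible approach, not a standard tool: build_robots_txt is a made-up helper, and "/thank-you/" is a hypothetical path standing in for your own page URL.

```python
def build_robots_txt(groups: dict[str, list[str]]) -> str:
    """Render {user_agent: [disallowed paths]} as robots.txt text."""
    blocks = []
    for user_agent, paths in groups.items():
        lines = [f"User-agent: {user_agent}"]
        # An empty Disallow value means nothing is blocked for this agent.
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        blocks.append("\n".join(lines))
    # One directive per line, with a blank line between groups.
    return "\n\n".join(blocks) + "\n"


# Hypothetical example: block the thank-you page for Bingbot only.
print(build_robots_txt({"Bingbot": ["/thank-you/"], "*": []}))
```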
How To Add a Sitemap to the Robots.txt File and Why Is It Important?
There are thousands of search engines in the world, and it is not possible to submit your site to every one of them. When you add your sitemap to the robots.txt file, any search engine that reads the file can discover it, so you do not need to submit your site to all of them.
However, submitting your site to Google and Bing is still important.
What Are a Sitemap and a Robots.txt File?
A sitemap is a list of all the URLs on your website; it tells the search engine about all the page and post URLs on your site. The sitemap does not improve your search ranking, but it helps search engines crawl your website better.
The robots.txt file helps search engines understand which parts of your site to index and which not to. When search engine robots visit your site, they follow the robots.txt file and index only the parts that you want indexed.
How To Add a Sitemap to a Robots.txt File?
First, go to the root directory of your site, select the robots.txt file, click the Edit button, and add your sitemap URL.
Now your robots.txt file will look something like this:
Sitemap: http://www.example.com/sitemap.xml
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
The Sitemap line can be placed anywhere in the robots.txt file. It does not matter where you put it.
How To Add Multiple Sitemaps to a Robots.txt File?
You can add a separate Sitemap line for each of your sitemap files, like this:
Sitemap: http://www.example.com/sitemap_host1.xml
Sitemap: http://www.example.com/sitemap_host2.xml
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
In this way, you can manage your sitemaps with the help of the robots.txt file.
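To confirm that the Sitemap lines are readable by a crawler, you can parse the file and list what it finds. This is a minimal sketch using Python's standard urllib.robotparser; it assumes Python 3.8 or newer, where the site_maps() method is available, and it only reads the Sitemap lines.

```python
from urllib.robotparser import RobotFileParser

# The multi-sitemap example from the article, one directive per line.
ROBOTS_WITH_SITEMAPS = """\
Sitemap: http://www.example.com/sitemap_host1.xml
Sitemap: http://www.example.com/sitemap_host2.xml
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_WITH_SITEMAPS.splitlines())

# site_maps() returns the Sitemap URLs in file order, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```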
If you have any questions or suggestions about this article, leave a comment. If this article has proved helpful for you, do not forget to share it!