Get to know robots.txt and how to set it up

Most bloggers probably already know what robots.txt is, and guides to the correct settings are scattered all over cyberspace. Here I will go through it again in full detail. What exactly is robots.txt? Does it need to be configured? What happens if I leave it alone? There may be many other questions you want answered too.

To make the meaning and the workings of robots.txt easier to understand, I have written this as a series of questions and answers.

What exactly is a robots.txt?

Robots.txt is a set of instructions that tells search engine robots which pages of a website or blog they may crawl and which they may not. You could say robots.txt is the blog's filter for search engines.

Does every blog have a robots.txt?

Every blog on Blogger already has a robots.txt, provided by default by Blogger. The default robots.txt on a blog looks like this:

User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://your_blog_NAME/feeds/posts/default?orderby=UPDATED

To see the default robots.txt of your own blog, open http://your_blog_NAME.blogspot.com/robots.txt in your browser.
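If you would rather look at the file from a script than from the browser, here is a minimal Python sketch. The function names robots_url and fetch_robots and the post address are made up for this example; only the standard library is used. It shows that robots.txt always sits at the root of the blog, whichever page you start from.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(blog_address):
    """Return the robots.txt address for the site that blog_address belongs to."""
    # robots.txt lives at the root of the host, whatever page we start from.
    return urljoin(blog_address, "/robots.txt")


def fetch_robots(blog_address, timeout=10):
    """Download robots.txt as text (needs a real, reachable blog address)."""
    with urlopen(robots_url(blog_address), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical post address, used only to show how the URL is built.
print(robots_url("http://your_blog_NAME.blogspot.com/2020/01/my-post.html"))
# prints http://your_blog_NAME.blogspot.com/robots.txt
```

To download the file, call fetch_robots with the real address of your blog in place of the placeholder.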

How do I read this code?

User-agent: Mediapartners-Google - the rules that follow are for the Google AdSense robot that crawls our blog.

Disallow: - means "not allowed". Here nothing is written after it, so nothing is blocked and the AdSense robot may crawl every page.

User-agent: * - the rules that follow are for all search engine robots.

Disallow: /search - robots are not allowed to crawl the search folder or anything under it, e.g. (/search/label) or (/search?updated)

Allow: / - all pages may be crawled, except the ones listed under Disallow above. The slash (/) more or less stands for the blog's address, i.e. the root of the blog.

Sitemap: http://your_blog_NAME/feeds/posts/default?orderby=UPDATED - the address of our blog's feed, which serves as its sitemap.
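To see these rules at work, you can feed the default robots.txt to Python's built-in urllib.robotparser and ask it which addresses a robot may crawl. This is only a sketch: the post address /2020/01/my-post.html and the label name are invented for the example, and Python's parser is a simplified reader, so real search engines may differ in small details.

```python
from urllib.robotparser import RobotFileParser

# The default Blogger rules, copied from above.
DEFAULT_RULES = """\
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://your_blog_NAME/feeds/posts/default?orderby=UPDATED
"""

BLOG = "http://your_blog_NAME.blogspot.com"

rules = RobotFileParser()
rules.parse(DEFAULT_RULES.splitlines())

# Hypothetical addresses, only for illustration.
checks = [
    ("Googlebot", "/2020/01/my-post.html"),          # ordinary post
    ("Googlebot", "/search/label/news"),             # inside the search folder
    ("Mediapartners-Google", "/search/label/news"),  # AdSense robot, empty Disallow
]

for agent, path in checks:
    allowed = rules.can_fetch(agent, BLOG + path)
    print(f"{agent:22} {path:25} {'allowed' if allowed else 'blocked'}")
```

The ordinary post is allowed, the label page is blocked for normal search robots, and the AdSense robot may still read it because its own Disallow line is empty.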

How do I keep robots away from a certain page?

Suppose you want to stop search engines from crawling a specific page of the blog, for example because you do not want the about me page to be indexed. Say the URL of the about me page is: http://your_blog_NAME.blogspot.com/p/about.html

Then for robots.txt, copy the default code above and add a Disallow line for the page that is not allowed. The result will look like this:

User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Disallow: /p/about.html
Allow: /
Sitemap: http://your_blog_NAME/feeds/posts/default?orderby=UPDATED
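Before saving a change like this, you may want to confirm that it blocks only the page you meant. Below is a small sketch with Python's urllib.robotparser; the post address is hypothetical. Note that this parser applies the first rule that matches, from top to bottom, which is one more reason to keep the Disallow lines above Allow: / as in the code above.

```python
from urllib.robotparser import RobotFileParser

# The edited rules, copied from above.
CUSTOM_RULES = """\
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Disallow: /p/about.html
Allow: /
Sitemap: http://your_blog_NAME/feeds/posts/default?orderby=UPDATED
"""

BLOG = "http://your_blog_NAME.blogspot.com"

parser = RobotFileParser()
parser.parse(CUSTOM_RULES.splitlines())

# The page we wanted to hide from search robots.
about_allowed = parser.can_fetch("Googlebot", BLOG + "/p/about.html")
# A hypothetical ordinary post that must stay crawlable.
post_allowed = parser.can_fetch("Googlebot", BLOG + "/2020/01/my-post.html")

print("about page:", "allowed" if about_allowed else "blocked")
print("normal post:", "allowed" if post_allowed else "blocked")
```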

How do I edit robots.txt?

You do not have to. If you leave robots.txt alone, that is not a problem: the blog will still be crawled by search engine robots because, as I mentioned before, every blog already has a default robots.txt.

Important:

Be careful when using robots.txt: if you write a rule wrongly, the blog could be ignored by search engines.
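One way to guard against such a mistake is to test the new rules against a few pages that must stay crawlable before you save them. The helper blocked_paths and the sample post address below are invented for this sketch; it relies on Python's urllib.robotparser, which reads the rules in a simplified way, so treat the result as a first warning rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(rules_text, must_stay_crawlable, agent="Googlebot"):
    """Return the paths from must_stay_crawlable that the rules would block."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return [path for path in must_stay_crawlable
            if not parser.can_fetch(agent, path)]


# A typo: "/search" was shortened to "/", which blocks the whole blog.
BROKEN_RULES = """\
User-agent: *
Disallow: /
"""

GOOD_RULES = """\
User-agent: *
Disallow: /search
Allow: /
"""

# Hypothetical pages that search engines should always be able to reach.
IMPORTANT = ["/", "/2020/01/my-post.html"]

print("broken rules block:", blocked_paths(BROKEN_RULES, IMPORTANT))
print("good rules block:", blocked_paths(GOOD_RULES, IMPORTANT))
```

An empty list means none of the important pages is blocked. Anything else is a sign the rules need another look before you save them.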

image: www.webseoanalytics.com
