How do I simulate crawling by search engine bots?
You can choose one of the following identification options for our crawler, i.e. the User-Agent it presents when crawling your website (see the request sketch after this list):
- Standard browser – the default and recommended option. Your website will load the same way your regular visitors see it.
- Googlebot – crawls your website as Google's crawler sees it. Our crawler will identify itself the same way as Google's web search bot (Googlebot/2.1).
- YandexBot – crawls your website as the Yandex search bot sees it. Our crawler will identify itself the same way as the main Yandex search bot (YandexBot/3.0).
- Baiduspider – our crawler will identify itself the same way as the Baidu web search bot.
- Mysitemapgenerator – direct identification of our crawler; use this option if you need separate control settings and the ability to manage the crawler's access to your website.
- When you choose the Googlebot, YandexBot, Baiduspider, or Mysitemapgenerator option, only the robots.txt instructions addressed to that particular bot are considered (User-agent: Googlebot, User-agent: Yandex, User-agent: Baiduspider, or User-agent: Mysitemapgenerator, respectively). The general instructions (in the User-agent: * section) are used only if there is no such "personal" section.
- When you use Standard browser or Mysitemapgenerator, our crawler considers only the instructions in the Mysitemapgenerator section (User-agent: Mysitemapgenerator) and, if that section is missing, those in the general section (User-agent: *) of the robots.txt file. Other "personal" sections (such as User-agent: Googlebot or User-agent: Yandex) are not considered. The robots.txt sketch at the end of this section illustrates this fallback behavior.
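In practice, "identification" simply means the User-Agent header sent with every HTTP request. Below is a minimal Python sketch of what such requests could look like. The Googlebot, YandexBot, and Baiduspider strings are the publicly documented ones for those bots; the "standard" and "mysitemapgenerator" entries are hypothetical placeholders, since the exact tokens our crawler sends are not specified here.

```python
import urllib.request

# User-Agent strings per identification option.
# The Googlebot/YandexBot/Baiduspider strings are the publicly documented ones;
# the "standard" and "mysitemapgenerator" entries are hypothetical placeholders.
USER_AGENTS = {
    "standard": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "yandexbot": "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
    "baiduspider": "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
    "mysitemapgenerator": "Mysitemapgenerator",
}

def fetch(url: str, identity: str) -> bytes:
    """Request a page while identifying as the chosen crawler."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENTS[identity]})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Load a page the way it would be served to Googlebot.
html = fetch("https://example.com/", "googlebot")
```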
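The robots.txt precedence rules in the notes above can be reproduced with Python's standard urllib.robotparser, which follows the same convention: a section naming the bot wins, and the User-agent: * section is used only as a fallback. The robots.txt content below is a made-up example, not taken from any real site.

```python
import urllib.robotparser

# Hypothetical robots.txt with one "personal" section and a general one.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /no-google/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Googlebot has a "personal" section, so only that section applies:
print(rp.can_fetch("Googlebot", "https://example.com/private/page"))    # True
print(rp.can_fetch("Googlebot", "https://example.com/no-google/page"))  # False

# A bot with no "personal" section falls back to the general rules:
print(rp.can_fetch("Mysitemapgenerator", "https://example.com/private/page"))  # False
```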