How do I simulate crawling by search engine bots?


You can choose one of the following identification options for our crawler, which determine how it crawls your website (a short sketch of how this identification works follows the list):
  • Standard browser – the default and recommended option. Your website will load the same way your regular visitors see it.
  • Googlebot – crawl your website as Google's crawler sees it. Our crawler will identify itself the same way as Google's web search bot (Googlebot/2.1).
  • YandexBot – crawl your website as the Yandex search bot sees it. Our crawler will identify itself the same way as the main Yandex search bot (YandexBot/3.0).
  • Baiduspider – our crawler will identify itself the same way as the Baidu web search bot.
  • Mysitemapgenerator – our crawler's direct identification; use it if you need separate control settings and the ability to manage website access.
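For illustration, here is a minimal sketch of how such User-Agent identification typically works. This is not Mysitemapgenerator's actual code: the Googlebot, YandexBot, and Baiduspider strings are the publicly documented ones, while the Standard browser and Mysitemapgenerator strings are assumptions made for the example.

```python
# Minimal sketch (not Mysitemapgenerator's actual code) of how a crawler
# presents a chosen identity via the HTTP User-Agent header.
import urllib.request

USER_AGENTS = {
    # Default option: a typical desktop browser string (assumed here), so
    # pages are served the same way regular visitors see them.
    "Standard browser": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    # Publicly documented search bot strings:
    "Googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "YandexBot": "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
    "Baiduspider": "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
    # Assumed token for the crawler's direct identification:
    "Mysitemapgenerator": "Mysitemapgenerator",
}

def fetch(url: str, identity: str = "Standard browser") -> bytes:
    """Fetch a page while identifying as the chosen crawler."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENTS[identity]})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()

# Example: fetch a page the way it would be served to Googlebot.
# html = fetch("https://example.com/", identity="Googlebot")
```

The server sees only the User-Agent string, which is why each option can change what content your site returns (for instance, if it serves different markup to known bots).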
Please note how the robots.txt file is processed for each identification method (a sketch of the fallback logic follows this list):
  • When choosing the Googlebot, YandexBot, Baiduspider, or Mysitemapgenerator option, only instructions addressed to that specific bot are considered (User-agent: Googlebot, User-agent: Yandex, User-agent: Baiduspider, or User-agent: Mysitemapgenerator, respectively). General instructions (in the User-agent: * section) are used only if there is no such "personal" section.
  • When using the Standard browser or Mysitemapgenerator option, our crawler considers only instructions in the Mysitemapgenerator section (User-agent: Mysitemapgenerator) and, if it is missing, in the general section (User-agent: *) of the robots.txt file. Any other "personal" sections (such as User-agent: Googlebot or User-agent: Yandex) are ignored.
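As a small illustration of this precedence rule, Python's standard-library robots.txt parser applies the same logic: a bot's "personal" section takes priority, and the User-agent: * section is used only as a fallback. The robots.txt content below is made up for the example.

```python
# Sketch of robots.txt section precedence using Python's stdlib parser:
# a matching "personal" User-agent section is applied exclusively, and
# User-agent: * serves only as a fallback.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /no-google/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot has a personal section, so the general (*) rules are ignored for it:
print(parser.can_fetch("Googlebot", "https://example.com/private/"))    # True
print(parser.can_fetch("Googlebot", "https://example.com/no-google/"))  # False

# A bot without a personal section falls back to the User-agent: * rules:
print(parser.can_fetch("Mysitemapgenerator", "https://example.com/private/"))  # False
```

Note that, just like the rules above, the parser never merges a personal section with the general one: whichever section matches the bot is applied on its own.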