Archive for November 30th, 2006

Microsoft explains how to verify MSNBot

Microsoft has posted information on the Live Search weblog about how to identify the different types of its MSNBot crawlers. The different types of MSNBot are:

MSNBot
Main web crawler (www.live.com)

MSNBot-Media
Images & all other media (images.live.com)

MSNBot-NewsBlogs
News and blogs (search.live.com/news)

MSNBot-Products
Products & shopping (products.live.com)

MSNBot-Academic
Academic search (academic.live.com)

They also explain a method to verify the identity of MSNBot. This might be useful, since a lot of spam bots disguise themselves as search engine crawlers.

  1. When you get a page view request, it specifies a user-agent and an IP address. As I described above, all requests from Live Search use a user agent starting with the word ‘MSNBot’.
  2. If you see the MSNBot user-agent, it’s time to check the identity of the bot. Starting with the IP address (e.g. 207.46.98.149), you can use reverse DNS lookup to find out the registered name of the machine.
  3. Once you have the host name (in this case, livebot-207-46-98-149.search.live.com), you can check that it really is coming from Live Search. The name of all live search crawlers will end with ‘search.live.com’. If the name doesn’t end with ‘search.live.com’, you know it’s not really our crawler.
  4. Finally, you need to verify that the name is accurate. In order to do this, you can use Forward DNS to see the IP address associated with the host name. This should match the IP address you used in Step 2 – if it doesn’t, it means the name was fake.
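
The four steps above can be sketched in Python roughly as follows. This is only a minimal illustration, not Microsoft's own code: the function names, the `ALLOWED_SUFFIX` constant and the injectable `reverse`/`forward` parameters are assumptions made for this example. It uses the standard library calls `socket.gethostbyaddr` and `socket.gethostbyname_ex`, and it checks for the suffix with a leading dot (`.search.live.com`), which is slightly stricter than the wording in step 3.

```python
import socket

# Assumption: match on the dotted suffix so the name must be a real
# subdomain of search.live.com.
ALLOWED_SUFFIX = ".search.live.com"


def reverse_lookup(ip):
    """Return the host name registered for an IP address, or None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the list of IPv4 addresses a host name resolves to."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_genuine_msnbot(user_agent, ip, reverse=reverse_lookup, forward=forward_lookup):
    """Apply steps 1-4: user agent, reverse DNS, suffix check, forward DNS."""
    # Step 1: the user agent must start with "MSNBot" (compared case-insensitively).
    if not user_agent.lower().startswith("msnbot"):
        return False

    # Step 2: reverse DNS lookup on the requesting IP address.
    host = reverse(ip)
    if host is None:
        return False
    host = host.rstrip(".").lower()

    # Step 3: the host name must end with search.live.com.
    if not host.endswith(ALLOWED_SUFFIX):
        return False

    # Step 4: forward DNS on that name must lead back to the same IP address.
    return ip in forward(host)
```

A call such as `is_genuine_msnbot(request_user_agent, request_ip)` would then be made with the values taken from the incoming request (both variable names are hypothetical). Because DNS lookups are slow, it is probably worth caching the result per IP address rather than checking on every page view.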

November 30th, 2006

