Big Brother Google Knows Best

October 7, 2009

Matt Cutts’ recent video on why pages disallowed by Robots.txt still appear in Google’s index is the latest display of Google’s arrogance. When you break it down, he is saying that Google doesn’t care what you tell them; they are going to do what they want with your site. To get them to follow your Robots.txt directives, you will have to jump through some additional hoops.

Rationalizing with Technicalities

I know that he is technically correct: Robots.txt only tells crawlers what not to crawl. Matt explains that Google is not actually “crawling” the blocked pages but is instead simply indexing them without ever seeing them. This brings up another issue related to search quality, but I digress. The problem is that most webmasters place pages in their Robots.txt because they don’t want them crawled or indexed. By indexing blocked pages, Google is violating the webmaster’s trust and intentions for the site. Google can essentially destroy a business for violating its guidelines, yet it has no qualms about violating the guidelines webmasters set for how robots interact with their sites.
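To make the crawl-versus-index distinction concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are made up for illustration. All the file can express is whether a crawler may fetch a URL; it has no vocabulary for "do not index".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The only question robots.txt answers: may this user agent fetch this URL?
blocked = parser.can_fetch("Googlebot", "https://example.com/private/report.html")
allowed = parser.can_fetch("Googlebot", "https://example.com/index.html")

print("fetch /private/report.html:", blocked)
print("fetch /index.html:", allowed)
```

Nothing in that answer says anything about whether the URL may be listed in search results, which is the gap Google is leaning on.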

Adhering to Google’s Whims

Obviously, if you want Google traffic you have to play the game. Some may be willing to go the cat-and-mouse blackhat route, while most will simply try to follow Google’s ever-changing rules and whims. That also means you have to watch out for Google deciding to ignore how you want your site to be accessed. So if you have your Robots.txt set to block certain pages, make sure you also add the Robots Meta Tag and set it to Noindex so Google does not violate your wishes by indexing them. Keep in mind that Google has to fetch a page to read that tag, so the tag only works on pages the crawler is allowed to reach. Who knows how long they will actually follow the Robots Meta Tag directives, but they claim to do it now. That could change tomorrow, or even be patently false right now, as Mr. Cutts has been known to say one thing when the reality is completely different. And always remember: Google knows better than you, and if you just do what they tell you to do, everything will be OK.
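If you want to confirm that a page really carries the Noindex directive, a rough check along these lines may help. It is only a sketch using Python's standard `html.parser`; the class name, the helper function, and the sample page are all hypothetical, and it looks only at the generic `robots` meta tag, not at crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page, invented for this example.
PAGE = """
<html>
  <head>
    <title>Private report</title>
    <meta name="robots" content="noindex, follow">
  </head>
  <body>Nothing to see here.</body>
</html>
"""

print("noindex present:", has_noindex(PAGE))
```

Run it against the saved HTML of any page you want kept out of the index; a False result means the tag is missing or misspelled.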

2 comments

1 Janky Asphunger October 28, 2009 at 2:06 am

Hi there,
Great approach. Yes, you are exactly right. Yes, I have seen this video. He explained about the dmv.ca.gov site. They have blocked a few pages with a nofollow tag, but Google is still indexing those pages…

I am facing the same problem. A few of my pages in Joomla are not supposed to be indexed, but Google is still indexing them, and my webmaster tools are showing many errors :( Do you know how to solve such problems?

2 Mark Pilatowski October 28, 2009 at 3:28 pm

The easiest thing to do is simply apply the robots meta tag, set to noindex, to the pages you don’t want indexed. As far as the errors go, I can’t be sure without actually looking at them and at your site to see what the deal is.
