Published Google Doc: autoindexed or hidden until linked?
August 28, 2009 4:42 AM

Is a published Google Doc document autoindexed by Google or hidden until it is first linked by another page?

I am sure the answer to this is in plain sight somewhere but I've been googling for 30 minutes (!) without finding the answer so I'd better ask for help.
posted by nolnar to Computers & Internet (5 answers total)
 
Google Docs are not indexed by Google because the site's robots.txt file blocks crawlers. Source. (Ionut Alex Chitu is a well-known blogger who writes about Google.)
posted by IndigoRain at 12:02 PM on August 28, 2009


Thank you. That's a credible source, so I'll take their word for it. It's weird, though, that Google doesn't explicitly state it on the Google Docs info pages (as far as I can tell from searching).
posted by nolnar at 3:16 PM on August 28, 2009


It's mentioned in their help article on 'Privacy and security':
Because robots and spiders can't get to your documents, spreadsheets or presentations, your docs won't appear in any search index.
posted by chrismear at 10:39 AM on August 30, 2009


The robots.txt file doesn't actually prevent the indexing of Google Docs. You can verify this for yourself by going to http://docs.google.com/robots.txt. If you publish a document, it'll have a /View link. However, Google doesn't crawl its own docs as a matter of course. It'll only crawl a document if (a) you've published it so it doesn't require credentials, and (b) you've linked to it from something that Google does crawl.
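If you'd rather check a path against robots.txt rules programmatically than read the file by eye, here is a minimal sketch using Python's standard urllib.robotparser. The rules in SAMPLE_ROBOTS_TXT are made up for illustration and are NOT the real contents of http://docs.google.com/robots.txt; the helper name is_allowed and the example document URLs are also hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only. These are NOT the real
# contents of http://docs.google.com/robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Allow: /View
Disallow: /
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one published-style /View link, one other path.
    for url in (
        "http://docs.google.com/View?docid=example",
        "http://docs.google.com/Doc?docid=example",
    ):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, url))
```

To test against the live file instead, you could call set_url("http://docs.google.com/robots.txt") and read() on the parser in place of parse(). Keep in mind this only tells you what a crawler is permitted to fetch; as noted above, being permitted is not the same as actually being crawled.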

On a bit of a tangent, you can configure a Google Search Appliance to crawl Google Docs using credentials.
posted by me & my monkey at 4:16 PM on August 30, 2009


Update: Google will soon allow documents that have been explicitly published to be crawled. Here's the post about it.
posted by chrismear at 1:47 PM on September 21, 2009

