Why don't my Google searches today include the results from the same search a year ago (even if they're in a different order)?
March 5, 2004 11:33 AM

Why does Google, which purportedly has a mondo cache of everything, "lose" search results over time? That is, if I search on something today that I last searched on a year ago, my results don't include all of last year's results. I understand that there's a complicated ranking algorithm, but that doesn't explain why some results go missing altogether.
posted by blueshammer to Computers & Internet (5 answers total)
 
Google's cache isn't intended to be an archive of everything that ever was on the web. Their cache holds only the most recently spidered version of a page; when they get a new one, the previous one is deleted. So one possibility is that the page has changed and no longer matches your search. Also, if they can't access a page, I believe they'll keep the cached copy for a while, but if the page still can't be accessed after several tries, the copy is deleted. So it may be that the page is just gone. See also the entry "My web pages used to be listed and now they aren't" in Google's Information for Webmasters.

It's also possible that the page has been removed from Google, including its cache, at the request of the person in charge of that page. See the entry "I need my site information removed."
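
To make the three cases above concrete (page changed, page unreachable, page removed on request), here is a toy sketch in Python. It is not Google's actual code: the class names, the `record_crawl` and `remove_on_request` methods, and the three-failure threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field

# Assumed threshold; how many failed tries Google really allows is not stated.
MAX_FAILED_FETCHES = 3


@dataclass
class CacheEntry:
    content: str
    failed_fetches: int = 0


@dataclass
class ToyCache:
    """Keeps only the latest copy of each page (hypothetical model)."""
    entries: dict = field(default_factory=dict)

    def record_crawl(self, url, content):
        """Record one crawl; content is the page text, or None if the fetch failed."""
        if content is not None:
            # A successful crawl overwrites the old copy. No history is kept.
            self.entries[url] = CacheEntry(content)
            return
        entry = self.entries.get(url)
        if entry is None:
            return
        # Keep the stale copy for a while, then drop it after repeated failures.
        entry.failed_fetches += 1
        if entry.failed_fetches >= MAX_FAILED_FETCHES:
            del self.entries[url]

    def remove_on_request(self, url):
        """The page owner asked for removal, so the copy goes away too."""
        self.entries.pop(url, None)

    def search(self, term):
        """Return URLs whose current cached copy contains the term."""
        term = term.lower()
        return sorted(u for u, e in self.entries.items() if term in e.content.lower())
```

In this model, a page that matched "widgets" last year stops matching as soon as a newer crawl without that word overwrites it, and disappears entirely after three failed fetches or a removal request.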
posted by DevilsAdvocate at 12:39 PM on March 5, 2004


Response by poster: Thanks, DA.
posted by blueshammer at 1:42 PM on March 5, 2004


If you're looking for everything on the web, ever, seek the help of the Wayback Machine. It'll be your new best friend.
posted by bshort at 1:52 PM on March 5, 2004


The most recent issue of Wired has a pretty good explanation of all things Google.
posted by drezdn at 2:08 PM on March 5, 2004


Also, there are different servers with different information. So, for instance, when I follow referral logs from Google, I occasionally can't find my site listed in the list of search results. That's because one search server has one iteration of the site and the others have a previous iteration, which did not include the terms that triggered the click.
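
A rough illustration of that effect, assuming two made-up index servers that hold different crawls of the same site. The server names, the sample text, and the `search` and `search_any` helpers are hypothetical, not anything Google exposes.

```python
import random

# Two hypothetical index servers holding different crawls of the same site.
REPLICAS = {
    "server-a": {"example.com": "notes on model trains and railways"},  # newer crawl
    "server-b": {"example.com": "notes on model trains"},               # older crawl
}


def search(replica_name, term):
    """Search a single server's copy of the index."""
    index = REPLICAS[replica_name]
    return [url for url, text in index.items() if term in text]


def search_any(term, rng=random):
    """Send the query to whichever server happens to be picked."""
    return search(rng.choice(sorted(REPLICAS)), term)
```

Here `search("server-a", "railways")` finds the site but `search("server-b", "railways")` does not, so a visitor who clicked through from one server may not be able to reproduce the result on another.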
posted by calwatch at 1:53 AM on March 6, 2004


This thread is closed to new comments.