Is Google broken?
December 31, 2005 9:51 AM

Why would Google display fewer results when I remove a word from my search?

I've typed the words "itunes", "label", "mechanicals" and "split" (without quotes or commas) into Google and I get 2020 results. When I then remove the word "split" from the search, I get only 227 results. Why would it do this? By broadening my search, shouldn't I get more than 2020 hits?
posted by gfrobe to Computers & Internet (15 answers total)
 
I get 28,000 hits for i/l/m/s and 74,000 for i/l/m
posted by bonaldi at 10:04 AM on December 31, 2005


Weird, because I'm getting the same thing as gfrobe.
posted by freudianslipper at 10:19 AM on December 31, 2005


By default, Google ANDs the results for each word:

about 2,020 for itunes label mechanicals split.

about 78,300,000 for itunes
about 134,000,000 for label
about 493,000 for mechanicals
about 105,000,000 for split

Use OR to see more pages:

about 314,000,000 for itunes OR label OR mechanicals OR split
posted by Lanark at 10:28 AM on December 31, 2005


Sorry, I was using "mechanical" and not "mechanicals" in my searches. I get the same results.

Lanark: Yes, but the poster is getting fewer hits with fewer search terms. An AND search implies that with each additional term you're viewing a subset of the results. We clearly aren't, in this case, as the set of i/l/m is smaller than i/l/m/s.
posted by bonaldi at 10:30 AM on December 31, 2005
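
To make the subset argument concrete, here is a minimal sketch of a strict Boolean AND over a toy inverted index. The page ids and the `index` and `and_count` names are made up for illustration; nothing here is Google's actual data or code. With an exact AND, adding a term can only keep the count the same or shrink it.

```python
# Toy inverted index: term -> set of page ids (made-up data for illustration).
index = {
    "itunes": {1, 2, 3, 4, 5, 6},
    "label": {2, 3, 4, 5, 7},
    "mechanicals": {3, 4, 5, 8},
    "split": {4, 5, 9},
}

def and_count(terms):
    """Exact number of pages containing every term (strict Boolean AND)."""
    pages = set.intersection(*(index[t] for t in terms))
    return len(pages)

three = and_count(["itunes", "label", "mechanicals"])
four = and_count(["itunes", "label", "mechanicals", "split"])

# The four-term result is a subset of the three-term result,
# so its count can never be larger.
print(three, four)
```

If the reported counts were exact intersections like this, the result gfrobe describes could not happen, which suggests the counts are something other than exact.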


I get roughly the same results as gfrobe (2000, resp. 351). If I skip to the last page of results, though, Google hides all but the first 185, resp. 157, saying the others are "very similar".

Also, it's not quite true that Google simply ANDs the search terms. Order matters. I get 669,000 results for "split itunes" (without quotes) and 1,040,000 for "itunes split".
posted by gleuschk at 10:34 AM on December 31, 2005


Not all four words have to appear on the page, though. Page 'X' can show up if one of the terms occurs frequently in pages linking to 'X', without necessarily showing up on 'X' itself. I don't know if that's enough to account for this discrepancy, but it's worth a thought.
posted by Wolfdog at 10:34 AM on December 31, 2005


Google does say that it "by default" includes all of your statements in a search, which I take to be an implication (as Lanark notes) that it is parsing your query as a bunch of ANDs compounded together.

However, I believe you are right that if it is using strict, shallow Boolean logic there is no explanation for ABC being smaller than ABCD. The ABCD count should be, at most, equal to the ABC count. Right?

So, we've got to assume they're not telling the full truth about how they handle search terms (big surprise), or how they report the number of responses. Maybe it starts by searching for all your statements as ANDs, and then, when that doesn't satisfy some threshold, it tries again with a different parse... maybe it throws in some ORs just for fun. *shrug*
posted by Hildago at 10:41 AM on December 31, 2005


Response by poster: That's strange, gleuschk, because I get the same number of results (1,080,000) on "itunes split" and "split itunes".
posted by gfrobe at 10:45 AM on December 31, 2005


Search engines do not perform a perfect search of your words across an identical domain for all searches. That is, if you were to search for two different phrases, the input search space will be very different. That's just how they work. Factors include keywords in anchor text, paid inclusion, the spamminess of your request, etc.
posted by kcm at 11:04 AM on December 31, 2005


Google returns nearly twice as many hits for cars milk fish (3,430,000) as for each of the other five rearrangements of these three words (1,870,000). Go figure.
posted by weapons-grade pandemonium at 11:11 AM on December 31, 2005




Assuming that none of you are in Holland, there's a geographical effect as well; when I do the same searches I get:

-itunes label mechanicals split-: 2,060
-itunes label mechanicals-: 2,270
-cars milk fish-: 7,140,000
-milk fish cars-: 3,350,000
-other permutations-: 3,340,000
posted by disso at 12:24 PM on December 31, 2005


The numbers are low enough on this particular search that it probably doesn't matter, but when it says "about 2000" it really means "about". It's an estimate, and by the time you reach the last page it'll usually be a bit smaller than the estimate. Some inconsistencies between searches can be explained by the estimate being more accurate for some queries than for others.
posted by mendel at 1:41 PM on December 31, 2005


Google is not one thing. Your search getting handled by a different data center or even server could make a difference, especially during the "Google dance" (Google's monthly major update of its index).
posted by abcde at 4:28 PM on December 31, 2005
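
One way estimates plus different servers could produce the odd result is sketched below. This is purely hypothetical: the two-shard corpus, the `estimated_count` function, and the "count one shard and scale up" rule are assumptions for illustration, not a description of how Google works. Each query is answered from a single shard and extrapolated, so a three-term query that lands on a sparse shard can report fewer hits than a four-term query that lands on a dense one, even though the exact counts behave as expected.

```python
# Hypothetical corpus split across two shards; each page is the set of words on it.
shards = [
    # Shard 0: few pages match the query terms.
    [{"itunes", "label", "mechanicals"}, {"itunes"}, {"label", "split"}, {"split"}],
    # Shard 1: matching pages happen to be clustered here.
    [{"itunes", "label", "mechanicals", "split"},
     {"itunes", "label", "mechanicals", "split"},
     {"itunes", "label", "mechanicals", "split"},
     {"label"}],
]

def exact_count(terms):
    """Exact AND count over every page in every shard."""
    wanted = set(terms)
    return sum(wanted <= page for shard in shards for page in shard)

def estimated_count(terms, shard_id):
    """Count matches in one shard only, then scale by the number of shards."""
    wanted = set(terms)
    hits = sum(wanted <= page for page in shards[shard_id])
    return hits * len(shards)

three = ["itunes", "label", "mechanicals"]
four = three + ["split"]

print(exact_count(three), exact_count(four))                 # exact counts
print(estimated_count(three, 0), estimated_count(four, 1))   # per-shard estimates
```

In this toy setup the exact counts shrink when "split" is added, but the estimate for the shorter query (taken from shard 0) comes out lower than the estimate for the longer query (taken from shard 1).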


What mendel said. Google doesn't calculate the actual number of pages with a given term or combination of terms, because that would require searching through all of the hundreds of thousands of results to see which contain all the terms. Instead it (I imagine) takes the word count for each term, runs it through a clever algorithm with some other statistics, and throws an essentially random number at you.
posted by cillit bang at 8:39 PM on December 31, 2005
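
For what "takes the word count for each term and runs it through an algorithm" might look like, here is one textbook-style sketch: treat the terms as statistically independent and multiply their frequencies. The per-term counts are the ones quoted earlier in this thread; `INDEX_SIZE` is an assumed round number, and the independence rule is a guess for illustration, not Google's documented method.

```python
from math import prod

# Per-term result counts quoted earlier in this thread.
term_counts = {
    "itunes": 78_300_000,
    "label": 134_000_000,
    "mechanicals": 493_000,
    "split": 105_000_000,
}

# Assumed total number of indexed pages (illustrative only).
INDEX_SIZE = 8_000_000_000

def independence_estimate(terms, index_size=INDEX_SIZE):
    """Estimate the AND count as index_size * product of per-term frequencies."""
    return index_size * prod(term_counts[t] / index_size for t in terms)

est_three = independence_estimate(["itunes", "label", "mechanicals"])
est_four = independence_estimate(["itunes", "label", "mechanicals", "split"])
print(round(est_three), round(est_four))
```

Two things fall out of this sketch. First, it gives far fewer hits than the roughly 2,020 reported for all four words, because words on real pages are correlated rather than independent. Second, this particular formula can only shrink as terms are added, so whatever "other statistics" go into the real number would have to account for the count going up.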

