How does Google search work?
January 18, 2016 3:00 PM

Specifically, autofill. A Gawker commenter noted that searching "David Vitter prostitute" did not auto-fill until the final letter of 'prostitute' had been entered. That seemed odd to me, so I Googled "Eliot Spitzer prostitute" and "Anthony Weiner sex scandal"; both auto-filled. Then I tried "Mark Sanford sex scandal" and "Herman Cain sex scandal" -- neither autofilled until the final letter. "David Vitter diaper" also gave no hint of where this story might be going till the final letter. These search results seem peculiar to me. David Vitter, former Senator and former Governor, is surely a higher-profile subject than long-forgotten Anthony Weiner or Eliot Spitzer. So my question is, how does that happen?
posted by mmf to Technology (8 answers total) 2 users marked this as a favorite
I don't have specific internal knowledge about this, but the general consensus is that this kind of auto-complete is driven by other people's queries that share the same starting phrase. So reading some sort of partisan motive into it is not generally appropriate. It's true that algorithms and their designers have incredible power over systems like this, but this is not the class of error that I think is typically an issue.
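To make the "driven by other people's queries" idea concrete, here is a minimal sketch in Python. It is not how Google actually does it; the query log, the `suggest` function, and the `limit` parameter are all made up for illustration. It just ranks logged queries that share the typed prefix by how often they were searched.

```python
from collections import Counter

# Hypothetical query log (invented for illustration); a real system would
# aggregate vastly more data and apply many other signals.
QUERY_LOG = [
    "eliot spitzer prostitute",
    "eliot spitzer prostitute",
    "eliot spitzer prostitute",
    "eliot spitzer documentary",
    "david vitter senate",
    "david vitter senate",
    "david vitter prostitute",
]


def suggest(prefix, log=QUERY_LOG, limit=3):
    """Return the most frequently logged queries that start with `prefix`."""
    prefix = prefix.lower()
    counts = Counter(query for query in log if query.startswith(prefix))
    return [query for query, _ in counts.most_common(limit)]


print(suggest("eliot spitzer p"))         # the popular completion surfaces early
print(suggest("david vitter ", limit=1))  # a more popular query crowds out the scandal
```

With only one suggestion slot, the less-searched "david vitter prostitute" never appears for the prefix "david vitter " in this toy log, with no bias needed to explain it.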

Barring something sinister, the answer is simply that people search more for the Spitzer and Weiner stories than the Vitter/Cain scandals. That doesn't seem that surprising to me, since they were far more salacious and had a lot more visual evidence, which makes them high profile.

Take a look at Google Trends for each person. Without even trying to pull out scandal-specific coverage, Weiner + Spitzer blow the others out of the water. Spitzer peaked higher than Herman Cain's entire campaign.

The other piece to keep in mind is that auto-complete results also vary depending on your own personal search history. So what you see and what others see is by no means guaranteed to be the same, especially if you've been actively researching a topic.
posted by heresiarch at 3:16 PM on January 18, 2016 [6 favorites]

I swear there was a comment on the blue in the last year or two about a parallel phenomenon but I can't find it now. Here's the gist of it: Start typing the name of any female celebrity. Dollars to donuts one of the autocomplete suggestions that appears will be "[name] feet." This is not because there are THAT many people into feet in the world, though there are more than you might think. It's because Google screens certain keywords out of appearing in autocomplete suggestions, and "feet" happens not to be one of them while more obvious ones like "breasts" are.

(However, I just tried this myself and it didn't work the same way. Either they've blacklisted feet now, or it's one of the other variables heresiarch mentions above. Your results will also vary according to your browser and whether you're on a desktop or a phone.)
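The keyword screening described above could look something like the following Python sketch. The blocklist contents, the `filter_suggestions` name, and the placeholder celebrity name are all assumptions; the terms Google actually screens are not public.

```python
# Hypothetical blocklist (an assumption for illustration only).
BLOCKED_TERMS = {"breasts"}


def filter_suggestions(candidates, blocked=BLOCKED_TERMS):
    """Drop any candidate suggestion that contains a blocked word."""
    return [
        candidate
        for candidate in candidates
        if not (set(candidate.lower().split()) & blocked)
    ]


candidates = ["some celebrity breasts", "some celebrity feet", "some celebrity movies"]
print(filter_suggestions(candidates))
```

Anything not on the list passes through, which is how an unlisted term like "feet" could end up looking more popular than it really is.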
posted by clavicle at 4:07 PM on January 18, 2016

Spitzer and Weiner are based in the fiercely competitive news hub of New York City. Any hint of a scandal involving an ambitious New York politician gets attention from not just the local news outlets but also all the networks as each tries to scoop their competitors. The end result is a lot of online material for Google to index.
posted by plastic_animals at 4:10 PM on January 18, 2016 [2 favorites]

First of all, the "didn't autofill until the final letter" is not a thing -- if you had to enter the final letter, it did not autofill.

Google excludes certain words from autocomplete, and also can remove words to avoid defamation. Google doesn't want to appear to be endorsing any particular allegation or theory. Vitter was seeking office as recently as last year, and as I understand it has not admitted to (or been charged with) certain things, i.e. the diaper thing. Spitzer and Weiner have not sought office for a couple of years, and as I understand it there is a lot more proven/admitted fact in their scandals. Plus, their scandals were legitimate pop culture events, much more so than Vitter's. "Eliot Spitzer prostitution scandal" and "Anthony Weiner sexting scandal" both have their own Wikipedia pages (David Vitter's scandal does not).
posted by acidic at 4:12 PM on January 18, 2016 [4 favorites]

Could it be that there were equal weightings for 'prostitution' and 'prostitute' in this context, and given that they are the same word until the 'e', it didn't know which to propose?
posted by blue_wardrobe at 8:37 PM on January 18, 2016 [1 favorite]

I'd bet good money it boils down to "Search is a very hard problem to solve" versus some sort of double standard.

You are seeing two different types of searching going on, "type ahead" search and "full word" search. In one case, the results are meant to be matched on the partial word as you are typing. In the other, the entire word needs to match.

I'm skipping the technical stuff here, but the "strategy" employed by each search type is very different. The strengths of one implementation tend to be the weaknesses of the other, and vice-versa. In fact, I wouldn't be surprised if completely different data sets are in play for each type.

I built some stuff for the streaming music product released by everyone's favorite fruit company last year, and I ran into this. If the search term was "Quee", and we were using the full word search endpoint, we got results for bands with "Que" in their names because "Quee" is very close to "Que" if you are thinking about a full word.

However, if we used the type ahead search endpoint, "Quee" matched "Queen", since the next letter was almost certainly an N.
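A rough Python sketch of the difference, under heavy assumptions: the catalog, both function names, and the edit-distance threshold are invented here and are not the actual endpoints described above. "Que Sera" is a placeholder band name. In this toy version the fuzzy full-word matcher treats "Que" and "Queen" as equally close to "Quee", while the prefix matcher only ever considers "Queen".

```python
# Hypothetical catalog, for illustration only.
CATALOG = ["Queen", "Que Sera", "Quiet Riot"]


def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        cur = [i]
        for j, char_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                       # delete char_a
                cur[j - 1] + 1,                    # insert char_b
                prev[j - 1] + (char_a != char_b),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def type_ahead_search(query, catalog=CATALOG):
    """Match names containing a word that starts with the partial query."""
    q = query.lower()
    return [name for name in catalog
            if any(word.startswith(q) for word in name.lower().split())]


def full_word_search(query, catalog=CATALOG, max_distance=1):
    """Match names containing a whole word within max_distance edits of the query."""
    q = query.lower()
    return [name for name in catalog
            if any(edit_distance(q, word) <= max_distance
                   for word in name.lower().split())]


print(type_ahead_search("Quee"))  # prefix match only
print(full_word_search("Quee"))   # fuzzy whole-word match
```

The prefix matcher assumes you are still typing, so it only extends the query. The fuzzy matcher assumes you are done and may have made a typo, so dropping a letter is as plausible as adding one.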

I have no idea how Google decides what type of search to use for what search, but I'm pretty sure that's what is going on.
posted by sideshow at 10:33 PM on January 18, 2016

Anecdata: I knew of the Weiner and Spitzer stories, but not of the Cain, Sanford, or Vitter stories.
posted by at at 4:02 AM on January 19, 2016

That is not autofill: autofill is when a browser fills in a form for you, like automatically putting your name, address and phone number in the right fields.

What you are describing is most commonly called "autocomplete", although Google's own term for it is "suggest".

Source: I worked on the first browser with autofill, later worked at Google.
posted by w0mbat at 10:55 AM on January 19, 2016
