Note to self: try this in ten years
February 1, 2011 9:56 PM

Has Google changed the structure of the web?

Google originally exploited the nature of citations, on the theory that citations were opinions about authority on a subject. It obviously worked quite well, but is there any concrete evidence that the feedback loop from people gaming the system has dramatically changed the structure of the web?

For example, are websites more miserly with outbound links now? I was reading a book and the question came to mind; I assume this sort of analysis might take ten years of advances in computers and a read-only copy of the WayBack dataset to answer, but maybe someone's figured out a clever technique?

Note that I'm not as interested in content or the existence of spammers as proof of feedback. Ideally, I'm looking for papers with a time series of PageRank before and after.
posted by pwnguin to Technology (10 answers total) 4 users marked this as a favorite
One thing I'm thankful for is that Google upranks search-friendly URLs. For example, that's the reason MetaFilter started adding the "Note-to-self-try-this-in-ten-years" part. Fark finally added it too recently.

Much nicer for the users IMHO, but programmers would be just as happy with numbers or GUIDs if it weren't for Google.
posted by sbutler at 10:01 PM on February 1, 2011 [1 favorite]

I think that Google, and people who were trying to game it, have inspired at least a few broad changes, such as the nofollow tag for anchors.
posted by Chocolate Pickle at 10:26 PM on February 1, 2011

If you could get huge datasets of the web you could actually analyze this numerically, by looking at the graph properties. But I suspect all you'd find is that the amount of garbage has increased (given some 'garbage detection' algorithm).
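
A minimal sketch of one such measurement, assuming you already have archived HTML for a set of pages at different dates. The helper names and the shape of the `snapshots` mapping are hypothetical, only Python's standard library is used, and a real study would need far more care (sampling, redirects, templated navigation links).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class OutboundLinkCounter(HTMLParser):
    """Counts <a href> links that point at a different host than the page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).netloc.lower()
        self.outbound = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL before comparing hosts.
        target = urlparse(urljoin(self.page_url, href))
        if target.scheme not in ("http", "https"):
            return  # skip mailto:, javascript:, and similar
        if target.netloc.lower() != self.page_host:
            self.outbound += 1


def count_outbound(page_url, html):
    """Return the number of off-site links in one archived page."""
    parser = OutboundLinkCounter(page_url)
    parser.feed(html)
    parser.close()
    return parser.outbound


def mean_outbound_by_year(snapshots):
    """snapshots maps year -> list of (page_url, html) pairs (hypothetical layout)."""
    return {
        year: sum(count_outbound(url, html) for url, html in pages) / len(pages)
        for year, pages in snapshots.items()
        if pages
    }
```

Feeding this the same sample of pages for each crawl year would give a crude time series of outbound links per page; whether any trend in it is caused by Google is a separate question.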
posted by delmoi at 10:31 PM on February 1, 2011

The Internet Archive has been crawling the web since 1996, and they make their archive accessible to researchers for analysis. I might look for papers linked to them.
posted by zippy at 10:46 PM on February 1, 2011 [1 favorite]

One thing I'm thankful for is that Google upranks search-friendly URLs. For example, that's the reason MetaFilter started adding the "Note-to-self-try-this-in-ten-years" part. Fark finally added it too recently.

I don't understand this; can you explain what you mean?
posted by meadowlark lime at 11:46 PM on February 1, 2011

I don't understand this; can you explain what you mean?

Because Google takes the actual contents of the URL into (some) account when ranking pages, you'll notice nearly every site will now include actual pertinent details and titles of articles IN THE URL so that instead of /node/2525666, they see something like /177364/Note-to-self.
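
A minimal sketch of how a site might build that kind of URL from a post title. The `slugify` and `post_url` helpers are hypothetical, and the rules here (ASCII-only, hyphen-separated, case preserved) are assumptions, not MetaFilter's actual implementation.

```python
import re
import unicodedata


def slugify(title):
    """Turn a post title into a hyphen-separated, URL-safe string."""
    # Strip accents and drop anything that cannot be represented in ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep runs of letters and digits; punctuation and spaces become breaks.
    words = re.findall(r"[A-Za-z0-9]+", ascii_title)
    return "-".join(words)


def post_url(post_id, title):
    """Keep the numeric ID for lookup and append the slug for readers and crawlers."""
    return f"/{post_id}/{slugify(title)}"


print(post_url(177364, "Note to self: try this in ten years"))
# prints "/177364/Note-to-self-try-this-in-ten-years"
```

Keeping the numeric ID in the path means the server can still look the post up by number, so the slug can change or be mistyped without breaking the link.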

Ask Metafilter is a poor example because people have fun with the titles, but at blogs and elsewhere, it's nice to get the gist of the page you're about to be linked to. And the search engines rank pages more highly (though how much impact that has is shifting) if the keywords show up in the URL for the page.

To the core issue of the question, yes and no. Some content farms try to push their outbound links and reinforce their own self-referential PageRank. But the core premise behind PageRank is that the authority of the pages linking to you lends you authority, in a way. You could create a self-linking circle jerk of domain names, but unless you get pages of some authority linking in to those, it will have little impact.
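
A minimal sketch of that premise, using the simplified textbook form of PageRank (power iteration with a damping factor), not whatever Google actually runs. The page names and both link graphs are made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank. links maps each page to the pages it links to."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of inbound links.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A page splits its damped rank evenly among its outbound links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical graph: a few sites link to "news"; three "farm" pages only link
# to each other and nobody links in to them.
isolated = {
    "news": ["blog"],
    "blog": ["news"],
    "wiki": ["news"],
    "forum": ["news"],
    "farm1": ["farm2"],
    "farm2": ["farm3"],
    "farm3": ["farm1"],
}

# Same graph, except the well-linked "news" page also links to "farm1".
linked = dict(isolated, news=["blog", "farm1"])

for name, graph in (("isolated", isolated), ("linked", linked)):
    ranks = pagerank(graph)
    print(name, {page: round(score, 3) for page, score in sorted(ranks.items())})
```

In the isolated graph, each farm page stays at the uniform baseline of 1/7, the same score it would have if ranks were handed out evenly, while "news" climbs well above that. Once "news" links in, the farm pages rise, which is why link farms chase inbound links from pages that already have authority.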

Lately, the content farms HAVE been getting more and more authority, and Google has recently enacted changes to their algorithm to help separate out the spammy content farms, which do a lot of the sort of citation-grubbing link-baiting you describe in their attempt to game the system.
posted by disillusioned at 11:54 PM on February 1, 2011 [1 favorite]

The style of modern blogging is extremely link-heavy -- both internal links to previous posts as well as external links to other blogs or other sources. The superstar bloggers write posts that simply overflow with links. That's part of the ethos of blogging, but you could argue that the situation came to be because bloggers were phenomenally rewarded by the Google ranking algorithm for their love of the link.
posted by Rhomboid at 12:11 AM on February 2, 2011

I'm just going to clarify about the "note-to-self" because I didn't understand it either....

"Note-to-self-try-this-in-ten-years" is the TITLE OF THIS QUESTION, not some feature that Metafilter added to its format that would allow you to flag items and look at them again in 10 years.

So sbutler and disillusioned are saying that Google rates URLs more highly if they contain keywords that help define the contents of the page. And Metafilter is participating in this by putting the TITLE and/or KEYWORDS of each post into the actual URL of that post.
posted by CathyG at 8:07 AM on February 2, 2011

Honestly, I've never figured out the point of giving questions a title; I don't really know what to do with it and usually wind up using it as an aside. Technically, when talking about that portion of the URL, it's called a slug.
posted by pwnguin at 8:19 AM on February 2, 2011

I believe that dashes instead of underscores in search-friendly URLs are because of Google too.

Most comment spam is also the result of trying to game Google, which is why it's full of keywords and links, and comment spam is usually the reason we need CAPTCHAs.
posted by NoraReed at 9:24 AM on February 2, 2011
