What is the point of PageRank, and why do people think that's a silly question?
May 31, 2006 7:39 PM

What on earth is the point of the "PageRank" reading given by the Google Toolbar?

Please don't explain to me what PageRank is, or theorise about how it's calculated. I know all of that, or the broad strokes anyway.

My question is, out of context what can it possibly mean? If I search for "foo", then all the sites matching "foo" are ranked according to their PageRank, yes. Their foo-rank is used to sort them.

But what does it mean that, absent all context:
  • Microsoft and BoingBoing are both nine
  • Shakespeare dot com and NASA dot gov are both seven
  • a page about chocolate chip cookies is a six?
Obviously, if one's looking for Microsoft information, Microsoft is a ten -- but if you're looking to bake some cookies, then it's pretty much a zero.

But there the PageRank is, and when I ask "how can there be a PageRank without a context?" people seem to think it's not a sensible question.

The little tooltip says it signifies a page's "Importance" on the internet.

So that cookie page is two-thirds as "important" to the internet as Microsoft? And somebody's movie fan site can be as "important" as the IMDB listing for that movie?
posted by AmbroseChapel to Computers & Internet (30 answers total)
 
PageRank doesn't have any great value out of context, but why does that matter? You might as well ask what being great at basketball is good for, off the court—it's not, but that doesn't make it meaningless.
posted by cortex at 7:52 PM on May 31, 2006


(Though whether one should be that impressed by someone who points out that, were they on the court, they would be Kicking Your Ass is another question. So too, PageRank probably won't get anyone any dates.)
posted by cortex at 7:52 PM on May 31, 2006


You have to look at this in aggregate. In specific cases there are probably plenty of examples where a low PR site has better information about a specific topic than a high PR site. But when you start looking at the bigger picture, on average, everything else being equal, a high PR site will have better information on 'foo' than a lower PR site.
posted by Rhomboid at 7:57 PM on May 31, 2006


It has to do with how much relevance is accorded a link from a page. Essentially it's a meta-rank.
So a link from boingboing to your site that mentions 'moops' is more important than a link from a lower ranked site when google calculates the results for a search for 'moops'.
posted by atrazine at 7:59 PM on May 31, 2006


When the toolbar first came out years ago, Google was still run by the techies with no one to tell them no, so they added all the shit they could think of to it. Now they probably wouldn't include it.
posted by smackfu at 7:59 PM on May 31, 2006


The main value of PageRank on the Google toolbar, as far as I can tell, is that website owners can see their PageRank just by visiting their own websites.
posted by evariste at 8:42 PM on May 31, 2006


Best answer: So the "context" here is that "this is my website, and my PageRank has gone up/down one".
posted by evariste at 8:43 PM on May 31, 2006


Best answer: You can also visit the competition and see how they're doing by comparison, I guess.

I use Opera so I don't have access to the Google toolbar, but that's how I've read on SEO (search engine optimisation) forums that website developers seem to view it. A useful feature if you own a website, not much use to the casual web surfer.
posted by evariste at 8:44 PM on May 31, 2006


Response by poster: a high PR site will have better information on 'foo' than a lower PR site

This, I'm afraid, falls under missing the point (there's NO FOO!) as does this:

a link from boingboing to your site that mentions 'moops' is more important than a link from a lower ranked site when google calculates the results for a search for 'moops'

because there's NO MOOPS!
posted by AmbroseChapel at 9:16 PM on May 31, 2006


If you add the tags NOFOO NOMOOPS to this question you will be my hero.
posted by evariste at 9:26 PM on May 31, 2006


I think you missed my point. I wasn't trying to say that you should try to apply the comparison to any particular term, I meant that in aggregate the higher PR sites tend to have better results. Of course there's no foo, as foo represents any search term.
posted by Rhomboid at 9:34 PM on May 31, 2006


In a way, you can think of PageRank as a sort of trustyness level for a website. While there can be false information on a high PageRank site, and totally true and useful information on a low PageRank site, a site's ranking gives you a rough idea of how well regarded a site is by the 'net community, as measured by how often people tend to link to it. In the same way that scientific journals are measured by how regularly they are cited, PageRank measures web pages by how often they are linked to.

What the tooltip says is absolutely correct: PageRank is a measure of a page's importance to the internet. An important scientific journal is one that is cited a lot in other publications. An important web page is one that is linked to a lot by other web pages. That's what importance means in this case. You might not find that chocolate chip cookie page important, but consider that it's been around since 1995, has had over a million hits, and is clearly an important resource on the 'net for such cookie information.

Now you may find that page unimportant if you're searching for information about, say, pan-African cooperation. This, however, is a matter of relevance, which is not what PageRank concerns itself with. Relevance is how much a page relates to a given topic or query, while PageRank measures importance, or how well regarded a particular page is on the net.

Hope this helps make things more clear. Feel free to respond with more questions.
posted by zachlipton at 9:39 PM on May 31, 2006


I don't work for Google so I don't know exactly what Page Rank means, but the numbers probably mean that out of sites whose content matches your search term, X site is ranked Y for being linked to by important sites. So Shakespeare.com is seventh among other sites with similar content (content about Shakespeare). It is seventh in terms of being linked to by important sites. So maybe Yahoo Directory links to Shakespeare.com, so that's an important link which gives it a high Page Rank like seven. Its Page Rank isn't better than seven, because there are six other sites ahead of it which are also about Shakespeare but get links from Yahoo Directory and even more big-time sites like Wikipedia and even universities.

In other words, the more big-time sites link to a site, the higher its Page Rank. Page Rank is in comparison to other sites with similar content, I think. I got this from skimming a 1998 article by Sergey Brin and Larry Page. But I have only looked at it quickly.

I'm pretty sure that it doesn't have to do with the content of the link itself. A lot of links just say "here" or "interesting," after all. Also I don't think Google is necessarily saying there is better information at Shakespeare.com than at a Shakespeare site ranked eight. I think they're figuring that site seven is more important than site eight. How? In that it gets links from more important sites.

It's a relative measure--they get all the links first and go around a few times readjusting levels of importance.

Zachlipton, you're right, although it's not just the number of links, but one step further--the number of links from sites with high Page Ranks.
posted by halonine at 9:43 PM on May 31, 2006


halonine: of course, and there are lots of other (secret in some cases) factors that influence a site's PageRank. The general simplification still stands though...

Also, with the toolbar, higher PR is better, so having a ranking of seven does not mean that there are six sites ahead of you in your category.
posted by zachlipton at 9:50 PM on May 31, 2006


Response by poster: If you add the tags NOFOO NOMOOPS to this question you will be my hero.

That would just be stupid.

And for the third time now, I don't need people explaining to me what PageRank is and how it's calculated. It's like you're not even reading my post.

I think the best answers so far are both by evariste, who's suggested two contexts -- change over time, and comparison with whoever users feel to be the competition. The rank, in other words, is context free, but people bring their own context.
posted by AmbroseChapel at 10:13 PM on May 31, 2006


Best answer: All due respect Ambrose, I think you're missing the point that Zach made.

The page rank is important for the links that come off of it, not for itself. At the risk of being one of those people who are redundantly explaining pagerank, I will try again.

You and I have pages about Foo. Five different sites link to my page, five other sites to yours. Google counts those as five "votes" for each of our pages, and we get sorted together in their results.

But wait! One of us (I'll be nice and say you) gets our page linked to by slashdot and four others. Slashdot is highly regarded on the internet and doesn't often link to spam, porn or sporn, so their pagerank is higher. Which means instead of their link to you counting as "+1" it counts as "+5".

So now I have a score of five and you have a score of ten and you get sorted way higher than me and AmbroseChapelsFooSiteExtraordinare.com goes on to be internet hot spot number one while BrainysFooapalooza.net languishes in obscurity with half assed graphics and a site built in tables.

To sum up:
PageRank is "How much this pages opinion counts to google."
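
A rough sketch of that weighted-vote idea in Python, for anyone who wants to poke at it. The site names and the flat default weight of 1 are made up for illustration; only the "+5" for a slashdot link comes from the example above. This is a toy, not Google's actual formula.

```python
# Hypothetical vote weights: how much a link from each site counts.
# Only the +5 for slashdot comes from the example; everything else is assumed.
VOTE_WEIGHT = {"slashdot": 5}
DEFAULT_WEIGHT = 1


def score(inbound_links):
    """Sum the weight of every site that links to a page."""
    return sum(VOTE_WEIGHT.get(site, DEFAULT_WEIGHT) for site in inbound_links)


# Five ordinary sites link to one page; slashdot plus four others link to the other.
brainy_links = ["site_a", "site_b", "site_c", "site_d", "site_e"]
ambrose_links = ["slashdot", "site_f", "site_g", "site_h", "site_i"]

brainy_score = score(brainy_links)
ambrose_score = score(ambrose_links)
```

By this arithmetic the slashdot-linked page gets four ordinary votes plus one weighted vote, so it lands comfortably above the page with five ordinary votes.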
posted by Brainy at 10:40 PM on May 31, 2006


shhh, lets not talk about my math. or lack of apostrophies.
posted by Brainy at 10:42 PM on May 31, 2006


I think usually pagerank is used to break ties. When you do a search for most things, you'll get back millions of results. Many different components are used to order the results. As I understand it, pagerank comes after all the text matching as a way of breaking ties between the winning results.
The thing to understand is that it's tremendously complicated searching billions of documents that quickly. Really, everything is about the speed. That is the main context of pagerank.
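
A minimal sketch of the tie-breaking idea, in Python. The URLs, text-match scores and PageRank values are all invented, and this only illustrates "text match first, PageRank second" as described here; it says nothing about how Google really combines its signals.

```python
# Hypothetical results: (url, text_match_score, pagerank). All values are invented.
results = [
    ("a.example", 0.9, 3),
    ("b.example", 0.9, 7),
    ("c.example", 0.5, 9),
]

# Sort by text match first; PageRank only decides between equal text scores.
ranked = sorted(results, key=lambda r: (r[1], r[2]), reverse=True)
ordered_urls = [url for url, _, _ in ranked]
```

Note that c.example has the highest PageRank but still comes last, because in this scheme PageRank never overrides a better text match.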
posted by alkupe at 10:49 PM on May 31, 2006


I think maybe AmbroseChapel is asking not "what is PageRank good for?" but "why is PageRank displayed in Google Toolbar?"

I suspect the answer to that is "why not?". Yes, it's distracting, but it also gives you a tiny smidgen more information about the page. If you've bothered to install the google toolbar, then presumably you're interested in seeing that kind of nonessential-but-maybe-interesting data. If you're smackfu, then you're presumably not interested in seeing that data.

(Re the other discussion: I think pagerank is more usefully interpreted not as the "quality" of the page but as the "popularity" of the page (do the other web pages invite it to play web-page games?). The underlying theory of Google is that popularity, in this sense, positively correlates with usefulness or desirability. Is it a sound theory? Maybe.)
posted by hattifattener at 1:16 AM on June 1, 2006


PageRank signals the importance of the page on the internet, where importance is "how many connections it has" -- hey, just like real life. The fan page is just as important as the IMDB site, because everyone is linking to the fan page instead of the dull IMDB page.

If you understand what PageRank is, I don't see how this can be confusing. A site with high PageRank is hugely linked to, and is therefore well-connected, and likely to be authoritative on whatever topic it is about.

It's in the toolbar so that whenever you stumble across what appears to be a shitty little site and it's got a PageRank of 9, you know there's actually something good here.
posted by bonaldi at 6:47 AM on June 1, 2006


I'm voting for smackfu's answer:

When the toolbar first came out years ago, Google was still run by the techies with no one to tell them no, so they added all the shit they could think of to it. Now they probably wouldn't include it.

If you're not a web developer trying to size up your site or a competitor's site, it's just data without context.

I don't see how PageRank matters if you're just surfing around. I doubt anybody would think, "Oh wow, this site has PageRank 7, it must be good!" As Ambrose implied in the original question, it's a tad too crude to be helpful at that level.
posted by Khalad at 6:55 AM on June 1, 2006


It's not so much that the site is good, it's that it has something good in it, otherwise it wouldn't get all the links. Think of it a bit like the MySpace friends count. Sure, half of those friends are bands, but someone with 200 pals has got something going for them.
posted by bonaldi at 6:57 AM on June 1, 2006


PageRank is partially based on how many other, reputable sites link to the current site. Reputable meaning that those sites are also widely-linked, among other criteria (not known for spam, etc). So if I'm wondering if a page is well-connected, I could look at the PageRank and find out. Well-connected is probably a better term than reputable and acknowledges that there's a lot of graph theory that goes into calculating relationships between sites.
posted by mikeh at 7:30 AM on June 1, 2006


It's not data without context. The pagerank doesn't vary based on searches. The pagerank is how trustworthy the page is at linking you to content that has information people appreciate. If you're on a page, and you see a link to a brand new blog that has an interesting article that might be bunk, you check the page rank of the site that links to it and you say "well, slashboing thinks this is good so (in theory) it probably is."

This started out as a true tale and turned into an analogy:
I have a smart friend whose opinion I take seriously. If he recommends a movie, a book, a restaurant, I know it's worth my time to check out.

His mental pagerank is like 9, so if somebody asks me (acting as google) about good restaurants, I might not be able to personally vouch for any because I'm poor and can't eat out, but I'll still say "Oh, I've heard good things about John's on 2nd Ave".

I have other friends with poor taste and their pagerank is like 4 and I'm probably not going to pass those recommendations on to anybody.
posted by Brainy at 10:13 AM on June 1, 2006


Response by poster: At the risk of being one of those people who are redundantly explaining pagerank, I will try again.

You didn't do that, Brainy, and you did manage to convince me of something rather paradoxical: PageRank is used to calculate PageRank.

It's like that old gag about the village where everyone makes their living taking in each other's washing. It makes me ask "how did the Google boys prime the pump, back in the day?".

Did they start with a million sites and a mean five million points of PageRank, and just shake their algorithm until it settled, with some people on ten and some on zero? If someone accidentally deleted Microsoft from Google's database, would everyone else's PageRank go down that little bit?

But again, you had to add context to your examples: you and I both have a site about "foo", and this is how you calculate our relative rank. But if I have a site about "foo" and you have a site about "bar"?

To put it another way, if I have a chocolate-chip-cookie website, it seems I'd be better off getting Microsoft to link to me than the person who comes up first in a search for "chocolate chip cookies", because she's only a six, and Microsoft is a nine.
posted by AmbroseChapel at 12:43 PM on June 1, 2006


But again, you had to add context to your examples: you and I both have a site about "foo", and this is how you calculate our relative rank. But if I have a site about "foo" and you have a site about "bar"?

My impression was that there is one static PR, and when you are doing a search you are pulling out all the entries based on your term and ranking them based on their rank relative to the entire internet.

Let's say there are 8 sites on the internet. 2 of them are about "foo" and 2 are about "bar." Call these sites foo_1, foo_2, bar_1, and bar_2. Say that all 4 of the other sites on our internet link to foo_1, 3 link to bar_1, 2 link to foo_2, and 1 links to bar_2, and further, that none of these other sites are linked to. So the overall, context free ranking, of the sites is
1. foo_1
2. bar_1
3. foo_2
4. bar_2
My assumption (I have never used the Google Toolbar and thus have no sense of what this number tends to be) is that this is the ranking, divided into deciles, that you see under PageRank. It really is the ranking of the page, based on the PR criteria for importance, relative to the entire internet. In my example, foo_1 would be more important than bar_1 even though they may be about totally unrelated topics (this is all of course based on a simplified PageRank in which only the number of links to a page matters).

Say you do a search on "foo" and "bar." Here's where you may think that PR is based on context. For "foo" you'll get foo_1 listed first in your results, and for "bar" you'll get bar_1 listed first. So you may think that bar_1 and foo_1 have the same PR. But my understanding of PR is that this is not the case. I'm pretty sure there's an overall PR and your search just returns as first the site with the highest PR of the sites containing your term.
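
Here is the eight-site example as a small Python sketch. The names other_1 through other_4 for the four linking sites are made up, the score is just the inbound link count (the simplified PageRank assumed above, not the real one), and matching a term against the page name is a stand-in for matching page content.

```python
from collections import Counter

# Which pages each of the four "other" sites links to.
# other_1..other_4 are hypothetical names for the four unnamed sites.
links = {
    "other_1": ["foo_1", "bar_1", "foo_2", "bar_2"],
    "other_2": ["foo_1", "bar_1", "foo_2"],
    "other_3": ["foo_1", "bar_1"],
    "other_4": ["foo_1"],
}

# Simplified, query-independent score: the number of inbound links.
inbound = Counter(target for targets in links.values() for target in targets)

# One global ranking for the whole toy internet, highest score first.
global_ranking = [page for page, _ in inbound.most_common()]


def search(term):
    """Return pages whose name contains the term, in global-ranking order."""
    return [page for page in global_ranking if term in page]


foo_results = search("foo")
bar_results = search("bar")
```

Both foo_1 and bar_1 come up first for their own term, yet they sit at different places in the single global ranking, which is the distinction being drawn.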

To put it another way, if I have a chocolate-chip-cookie website, it seems I'd be better off getting Microsoft to link to me than the person who comes up first in a search for "chocolate chip cookies", because she's only a six, and Microsoft is a nine.

I'm almost certain this is the case. Whether or not the site linking to you is also about cookies doesn't matter. Only the raw popularity of the linking site matters. In some sense this is a reasonable way to do things and in another sense it's not. It's reasonable in that sites about different topics can be reasonably compared in importance (e.g., my blog is less important than whitehouse.gov). However, one could also make an argument for ranking pages based on the authority of sites within a field. For example, I would probably trust the opinion of the top cookie website about hot new cookie sites more than I would trust slashdot's opinion on the best new cookie websites, since the former is presumably an expert in the field and the latter is not. If my assumptions about how PR works are correct, though, since slashdot is more important overall, its opinion would matter more in determining the ranking of new cookie websites.
posted by epugachev at 1:39 PM on June 1, 2006


The Google Papers: 1 [ps], 2 [pdf].
posted by epugachev at 1:45 PM on June 1, 2006


I'm almost certain this is the case. Whether or not the site linking to you is also about cookies doesn't matter. Only the raw popularity of the linking site matters.

This is no longer strictly the case, but it was originally. Google continually tweaks PageRank and context does matter a lot more now than it did before. "Why is Microsoft linking to a chocolate chip cookie maker when they normally link to software stuff? This link shouldn't count as much." For instance, in the latest major refresh of Google's ranking algorithm ("BigDaddy"), sites that had previously gotten unearned PageRank by exchanging reciprocal links with totally-unrelated-to-their-field websites lost all of it.
posted by evariste at 2:23 PM on June 1, 2006


Best answer: how did the Google boys prime the pump, back in the day?

The Google Pagerank Algorithm and How It Works presents a simple iterative algorithm to do it and walks you through it a couple of times.
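
For the curious, a minimal Python sketch of that kind of iteration, using the formula from the original Brin and Page paper, PR(A) = (1 - d) + d * (sum of PR(T)/C(T) over the pages T linking to A). The toy graph, the function name, the `start` parameter and the iteration counts are all my own assumptions, and d = 0.85 is just the commonly quoted damping value. Whatever Google runs in production is surely more involved.

```python
def pagerank(links, damping=0.85, iterations=50, start=1.0):
    """Iteratively estimate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    start is the arbitrary initial guess given to every page.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {page: start for page in pages}
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Each linking page passes on its current rank, split evenly
            # across all of its outbound links.
            incoming = sum(
                rank[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_rank[page] = (1 - damping) + damping * incoming
        rank = new_rank
    return rank


# Hypothetical toy graph: A and B link to each other, C links to A.
toy = {"A": ["B"], "B": ["A"], "C": ["A"]}
ranks = pagerank(toy)

# "Priming the pump": very different starting guesses settle to the same values.
from_zero = pagerank(toy, iterations=200, start=0.0)
from_ten = pagerank(toy, iterations=200, start=10.0)
```

The point for the pump-priming question is the last two lines: the starting numbers are arbitrary, and repeating the update is what makes the ranks settle.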
posted by jewzilla at 4:12 PM on June 1, 2006


Response by poster: Oh no. Now I'm going to have to give a "best answer" to someone who explained how PageRank works after all?

But seriously, that paper does indeed answer a few questions, and indeed focuses quite specifically on the paradox that PageRank is used to create PageRank.

I now have a mental picture of PageRank as some kind of mammoth, slowly-but-constantly-moving process like the tides -- and it's paradoxical only in the way that "evaporation from the sea makes the clouds, clouds make rain, rain makes rivers which flow down to the sea" is paradoxical. You have to go looking for your creation story to explain how it started.

I also have the vision that Sergey or Larry has a terminal open in their office, displaying "1.85" and that if they go to that terminal and type 1.9 or 1.8 then all hell breaks loose ...
posted by AmbroseChapel at 7:27 PM on June 1, 2006

