How do you use paired comparisons to rank a large set of items?
What strategies give useful aggregate ranking information when asking people to pick a favorite between two items selected at random (or close to random) from a large set of possible items?

For example, the website that shows two pictures of cats and asks you to pick your favorite, or the movie site that does the same thing.

What do they do to extract as much meaningful information from these questions as possible?

Are tiers a part of it? For example, cats with wins >= X reach tier 2, and only cats from the same tier are shown next to each other.

Is that a thing? Would such a selection strategy make results from the comparison more useful? What other techniques are used to make the most of the limited information given by such a comparison?
Closer to the original question, I think: look at Elo ratings and the various newer related rating systems, which are linked from the Wikipedia article on Elo.
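
To make that concrete, here is a minimal sketch of an Elo-style update applied to "pick your favorite" votes. It is only an illustration, not what any particular site does: the starting rating of 1500, the K-factor of 32, the 400-point scale, and the item names are all assumptions (common chess conventions), and the function names are made up for this example.

```python
def expected_score(rating_a, rating_b):
    """Modelled probability that A is preferred over B (standard Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner_rating, loser_rating, k=32.0):
    """Return the new (winner, loser) ratings after one comparison.

    k is an assumed step size; larger values react faster but are noisier.
    """
    # The winner gains more when the win was unexpected (low expected score).
    delta = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + delta, loser_rating - delta


# Hypothetical items, all starting at the same assumed rating.
ratings = {"cat_1": 1500.0, "cat_2": 1500.0, "cat_3": 1500.0}

# Each vote is (winner, loser), in the order the votes arrived.
votes = [("cat_1", "cat_2"), ("cat_1", "cat_3"), ("cat_2", "cat_3")]

for winner, loser in votes:
    ratings[winner], ratings[loser] = update(ratings[winner], ratings[loser])

# Highest rating first.
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)
```

On the tier idea: under this model a vote between two items with similar ratings has an expected score near 0.5, so its outcome is the least predictable. That suggests pairing items with close ratings, which is roughly what tiers would do, though I have not checked how the sites you mention actually choose their pairs.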
