a real Peregrine Pickle
August 25, 2024 8:03 AM

Why is Tobias Smollett (1721-1771) so popular on Project Gutenberg?

If you look at the top downloaded authors and books on PG, Tobias Smollett - and specifically, three of his novels (Ferdinand Count Fathom, Humphry Clinker, and Roderick Random) - are consistently some of the most downloaded books on the site. On all three top-author lists, he's the second most-downloaded person on there, ahead of Dickens, Twain, Austen, Melville, etc., and second only to Shakespeare.

No shade on Smollett, but I don't usually hear him mentioned in the same breath as the other top 10 authors. Is there a large but quiet Smollett Society somewhere? Did someone decide that those three books are ideal for training 18th-century chatbots? Theories welcome! Actual evidence even better!
posted by theodolite to Grab Bag (13 answers total) 5 users marked this as a favorite
 

this is a description from a recent presentation [Melbourne]:
Smollett’s time in Jamaica and his interest in global trade gave him an important vantage on how advances in imperial citizenship law, such as those made by the Plantation Act (1740) [wiki], underwrote the campaign to extend naturalization rights to minority communities in England.
perhaps more than the other authors mentioned, Smollett’s significantly relevant to current Scots efforts to join the European Union (again) [gov.scot]
posted by HearHere at 8:28 AM on August 25 [2 favorites]


My theory would be that it’s much harder to find used or new paperbacks of Smollett than any of the others you mentioned. He’s got name recognition but if you actually want to read him you basically have to go to Gutenberg, unlike others.
posted by demonic winged headgear at 8:35 AM on August 25 [10 favorites]


Could there be some class where his books are a useful source? That plus demonic winged headgear's idea might drive significant demand, if it's at multiple colleges.
posted by dick dale the vampire at 8:37 AM on August 25 [2 favorites]


Maybe there's some reinforcement effect, where once he got on the list some portion of people looking at the list were all "who is this person?" and downloaded his books (or just followed the link to take a quick look at the web version) out of curiosity.

Incidentally, his wikipedia page has this nice note:
Laurence Sterne, in his A Sentimental Journey Through France and Italy, refers to Smollett under the nickname of Smelfungus, due to the snarling abuse Smollett heaped on the institutions and customs of the countries he visited and described in his Travels Through France and Italy.
posted by trig at 9:54 AM on August 25 [4 favorites]


Response by poster: One of the things that's so odd is that it doesn't seem to be Smollett in general, but just those three books I mentioned. For example, if you look at the most downloaded books by Charles Dickens, they descend like this:

A Tale of Two Cities - 16,017 downloads (in the last 30 days)
Great Expectations - 12,527 downloads
A Christmas Carol - 9,521 downloads
Another edition of A Christmas Carol - 7,600 downloads
Oliver Twist - 6,074 downloads
David Copperfield - 4,855 downloads

...which looks like ordinary organic interest in a popular author. But the Smollett page looks like this:

The Adventures of Ferdinand Count Fathom - 42,841 downloads
The Expedition of Humphry Clinker - 41,629 downloads
The Adventures of Roderick Random - 41,445 downloads
The Adventures of Peregrine Pickle - 376 downloads(!)
A Philosophical Dictionary (by Voltaire, with Smollett as a commentator) - 268 downloads

There's no middle ground between the apparently massive popularity of those three books and everything else, and the fact that the number of downloads for those three is almost the same also seems strange.
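To put rough numbers on that, here is a minimal sketch that only uses the counts quoted above. The helper names (`step_ratios`, `relative_spread`) are made up for illustration, and it shows how unusual the pattern is, not what is causing it.

```python
# Download counts (last 30 days) copied from the two lists above.
dickens = [16017, 12527, 9521, 7600, 6074, 4855]
smollett = [42841, 41629, 41445, 376, 268]


def step_ratios(counts):
    """Ratio of each title's downloads to the next title down the list."""
    return [round(a / b, 2) for a, b in zip(counts, counts[1:])]


def relative_spread(counts):
    """(max - min) / max for a group of counts, as a fraction of the max."""
    return (max(counts) - min(counts)) / max(counts)


# Dickens falls off gently: each title has roughly 1.25x to 1.3x the next one.
print("Dickens step ratios: ", step_ratios(dickens))

# Smollett is nearly flat across the top three, then drops by a factor of
# over 100 between Roderick Random and Peregrine Pickle.
print("Smollett step ratios:", step_ratios(smollett))

# The top three Smollett titles sit within a few percent of each other.
print("Top-three spread:    ", round(relative_spread(smollett[:3]), 3))
```

A steady ratio all the way down is what a normal popularity tail looks like. A flat top followed by a single huge drop is the part that seems strange.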
posted by theodolite at 10:23 AM on August 25 [4 favorites]


Given how close those numbers are, I suspect that somebody is downloading those books with a script and probably has messed up somewhere down the line.
posted by Tell Me No Lies at 12:16 PM on August 25 [7 favorites]


What Tell Me No Lies said. My first thought was that it was either somebody’s automated test case, or a script had gone bonkers.
posted by graphweaver at 5:01 PM on August 25


Is there a chance it's part of a corpus to write fake Jane Austen? Could be any additional downloads of JA just come across as noise while for Smollett they really juice the numbers.
posted by fiercekitten at 7:44 PM on August 25


That is such a good question.

English is my main, learned language, so I operate on the assumption that even though I love the London Review of Books, I just may not be aware of some author who is a tentpole of English Literature.

I really like Margaret Oliphant - but no-one seems to read anything of hers. So I just assumed Smollett was about the same.

I keep meaning to have a crack at "The Faerie Queene" by Spenser, which is a major allegorical work and probably requires more knowledge of English politics than I have. On the other hand, Hilary Mantel may have given me a basic compass.
posted by Barbara Spitzer at 2:19 AM on August 26 [1 favorite]


I keep meaning to have a crack at "The Faerie Queene" by Spenser.

Just for the record that was written around 1600 A.D., roughly the same time that Shakespeare was active. Many modern English speakers have trouble understanding either. Both of them include elements of Middle English and that language is dead today.

However, if you speak German you may be in luck understanding Middle English.
posted by Tell Me No Lies at 6:12 AM on August 26


Speaking personally, I read about Smollett somewhere (I forget where), was curious to read some and my local library didn't have a lot, if any. So Gutenberg was the only source I could find, and as it's easy to download from there to my Kindle, that's what I did as it's essentially cost-free. Have I read what I downloaded? No I haven't...
posted by altolinguistic at 7:06 AM on August 26 [1 favorite]


Given how close those numbers are, I suspect that somebody is downloading those books with a script and probably has messed up somewhere down the line.

Alternate theory: someone used those three books in a code example somewhere, perhaps to demonstrate Gutenberg's API or similar. Or ChatGPT decided these three books are the ones to include when someone asks it for code to do the same.

Is there a chance it's part of a corpus to write fake Jane Austen? Could be any additional downloads of JA just come across as noise while for Smollett they really juice the numbers.

I think this is fairly unlikely. People probably aren't training AustenGPT several thousand times and re-downloading the books each time.
posted by BungaDunga at 8:25 AM on August 26


Thanks for the tip on "Faerie Queene" - my spoken German is rusty, but I can read it well, so that is encouraging.
posted by Barbara Spitzer at 2:58 AM on August 27

