How does Google Books decide which books are open access?
December 15, 2014 7:47 PM   Subscribe

Many apparently out-of-copyright books on Google Books are open access. Others return limited sections in response to search queries. Yet others are completely inaccessible. Why?
posted by dontjumplarry to Computers & Internet (7 answers total) 1 user marked this as a favorite
Copyright laws vary internationally. Are you looking at materials published in the same country with differing access levels?
posted by jetlagaddict at 8:02 PM on December 15, 2014

Response by poster: I'm thinking of old, obscure early 19th century English books that are presumably out of copyright everywhere. I guess I'm asking, is this to do with state law? Library/university agreements? Or is Google themselves selectively restricting access?
posted by dontjumplarry at 8:08 PM on December 15, 2014

I wonder this too. I think in some cases they have records from their publishing partners but don't actually have scanned versions of the titles. I notice it a lot when they have a bunch of items in a series and some are scanned. I'll poke around because I've seen stuff that is definitely public domain, US stuff and it's still not available and I've wondered why.
posted by jessamyn at 8:10 PM on December 15, 2014 [1 favorite]

"As with all of our decisions related to the Google Books content, we're conservative in our reading of both copyright law and the known facts surrounding a particular book. If we don't know for sure that a book is in the public domain, you'll see at most bibliographic information about the book and a few short snippets - sentences of your search term in context."
posted by paleyellowwithorange at 9:07 PM on December 15, 2014 [2 favorites]

I would like to think that their decisions make sense, but I've seen books from the 1850s appear and then disappear for no apparent reason (dudes! not in copyright! not in any country!). Often, you can find the same book in full view at HathiTrust or
posted by thomas j wise at 5:30 AM on December 16, 2014 [1 favorite]

I've definitely come across very suspicious-looking situations where it appears as though some publisher has grabbed a copy of a public domain book, obtained an ISBN and otherwise arranged to sell reprints of it, and then filed some sort of takedown request with Google to try to create artificial scarcity. I've wondered if antique book dealers might try to do the same thing, having seen other situations where a book that is definitely public domain starts off open access on Google Books and mysteriously becomes unavailable at some point.

I'm probably being overly paranoid but in a couple of cases this has appeared to happen in concert with related political events, like an original source from the 1890s concerning Boston marriages becoming inaccessible after I'd mentioned it a few times in online debates about same-sex marriage. I just don't know how easy it is to get Google to block access.

So I'm very glad for the existence of HathiTrust, the Internet Archive, and other repositories of public domain works, which I'd expect to be less likely to allow sleazy stuff to happen. As thomas j wise suggests, if I go back to an old Google Books link and find it no longer works, I'm usually able to find the same thing at one of those two sites.
posted by XMLicious at 6:03 AM on December 16, 2014 [1 favorite]

I think the main factor is that Google uses an algorithm, not a human being, to determine when a work was published. This can lead to bizarre results, to say the least—e.g., on older books that list the publisher's street address on the title page, a four-digit house number might be taken for the date of publication.
posted by brianogilvie at 10:55 AM on December 16, 2014
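
To make the house-number mix-up above concrete, here is a minimal sketch in Python of how a naive date-guessing heuristic could go wrong. This is not Google's actual method, which nobody in this thread has seen. The function name `naive_publication_year`, the regex, and the sample title page are all invented for illustration.

```python
import re

# Hypothetical heuristic: accept any standalone four-digit number
# between 1500 and 2099 as a candidate publication year.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def naive_publication_year(title_page_text):
    """Return the first year-like number on the title page, or None."""
    match = YEAR_PATTERN.search(title_page_text)
    return int(match.group(1)) if match else None


# Invented title page: the street number appears before the real date.
title_page = (
    "A Treatise on Gardening. "
    "Printed by Example & Co., 1923 Market Street. 1851."
)

guessed_year = naive_publication_year(title_page)
print(guessed_year)  # the house number, not the 1851 imprint date
```

A heuristic like this would file an 1851 book under 1923, and a conservative copyright check keyed on that year could then withhold full view. Preferring the last year-like number, or cross-checking against library catalog records, would be obvious refinements, though whether Google does anything of the sort is a guess.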

This thread is closed to new comments.