Launching a website that contains public government data
March 28, 2008 6:46 PM

Where can I find information about the legality of using public government data on my website?

I came across the website EveryBlock.com, which takes data from city governments and posts them on its website in a nice user interface. These include things like dates and locations of crimes, liquor license violations, and so on. (I have also seen websites that allow you to search for sex offenders in your neighborhood.) I am interested in a vaguely related idea that would involve using public government data that is already available on the internet. What are the rules about copying data from government websites and using it for a private website? It seems legal since EveryBlock is doing it without apparent problems, but I'd like to know how I can go about ascertaining this if I want to use a particular city or state's data. (Note: I'm just looking for general information on this topic, and not specific legal advice about my idea.)
posted by lunchbox to Law & Government (8 answers total)
 
All federal work is in the public domain. I assume this applies to state and local government, but I'm not positive.
posted by null terminated at 6:57 PM on March 28, 2008


Slight tangent: I doubt that EveryBlock is scraping the data from other web sites. They've probably bought (or otherwise obtained) a database, or are using an API to access the info. (This probably also means that the terms for using the info were provided to them by their source.)
posted by winston at 6:59 PM on March 28, 2008


null terminated: I think the federals are the only folks required to have their stuff in the public domain.
posted by the dief at 8:09 PM on March 28, 2008


I just had to research this for work: works created by federal employees in the course of their jobs are in the public domain (excepting some logos and a few other things that you are not allowed to use), but state and local governments are not necessarily the same. You'd be best off checking your local statutes to find out about state and city government sites.

For instance, IANAL, but I see the site you linked to is formatting the data in its own way. They are not copying the city of San Fran's website verbatim or using its SFGOV logo, etc. It looks like all the information they are posting is available as public record (business permits, etc.). However, the city might take offense if they posted an exact copy of its website, pics and logos (it does have a copyright notice at the bottom).

I spent $120 to speak with a copyright attorney once about a product I was licensing and it was the best money I ever spent. He told me things in plain English that I would not have known to look for on the US Government's copyright site.
posted by Marie Mon Dieu at 3:24 AM on March 29, 2008


I assume this applies to state and local government, but I'm not positive.

It does not apply to state and local governments, but there may be other reasons that the work is not copyrightable subject matter. Also, there is a possibility that EveryBlock's specific arrangement of the works (whether public domain or not) may be subject to protection.

I am an intellectual property lawyer, but I am not your lawyer. The above is not legal advice.
posted by anathema at 5:50 AM on March 29, 2008


there may be other reasons that the work is not copyrightable subject matter

You'd be hard-pressed to find anyone knowledgeable on the matter who will tell you that raw data can be copyrighted. Clever arrangement of the data can be, but the data can't be.

I also think that NASA can claim copyright on things it makes, even if it's a branch of the federal government.
posted by oaf at 7:50 AM on March 29, 2008


I also think that NASA can claim copyright on things it makes, even if it's a branch of the federal government.

Not so. The restrictions have to do with use of marks and sponsorship.
posted by anathema at 10:13 AM on March 29, 2008


I point to Numbrary and (my own) infochimps.org as two sites that are doing similar things. Infochimps was designed to host and distribute exactly this kind of data. Also, theinfo.org is a burgeoning community for us data nerds.

As for laws on copyright & data: two great resources are iusmentis on database law and bitlaw on compilations and databases.

Government or not, a comprehensive assemblage of facts cannot, in general, be copyrighted. My non-lawyer but well-investigated understanding (the following applies only to the US, where the database laws are actually more liberal than elsewhere; I have no idea what the situation is outside the US): Copyright only applies where there is 'creative' content. A comprehensive list of cars and retail prices cannot be copyrighted; a comprehensive collection of *reviews* of those cars can be copyrighted. This is the important Feist Publications v. Rural Telephone Service case:
"Facts, whether alone or as part of a compilation, are not original and therefore may not be copyrighted. A factual compilation is eligible for copyright if it features an original selection or arrangement of facts, but the copyright is limited to the particular selection or arrangement. In no event may copyright extend to the facts themselves." -- Sandra Day O'Connor for the Supreme Court

"Collections of facts are not copyrightable per se ... A compilation, like any other work, is copyrightable only if it satisfies the originality requirement ('an original work of authorship'). Facts are never original, so the compilation author can claim originality, if at all, only in the way the facts are presented. The facts must be selected, coordinated, or arranged 'in such a way' as to render the work as a whole original." -- Sandra Day O'Connor for the Supreme Court

A presentation of data can be creative -- you can't xerox the Blue Book and hand that out. However, converting the data into your own creative presentation satisfies this restriction. Likewise, a presentation (original or converted) that did not arise from a creative act gets no protection -- you couldn't claim copyright on a .CSV file of some dataset.

Besides "presentation" and a couple of edge cases (such as "hot news" or "selection and arrangement"), the main restriction to be aware of is "Terms of Service": if you have to agree to terms of service that restrict the data, but you take it anyway, you can be liable for trespass. My understanding there is that if a) you can access the site by robot (no person clicks anything) AND b) there is no robots.txt, they can't sustain a claim that it's a restricted resource.
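
For the robots.txt part, here is a rough sketch of how a scraper could check what a site's robots.txt permits before fetching anything, using Python's standard urllib.robotparser. The robots.txt text, the "example-bot" user agent, the data.example.gov URLs, and the may_fetch helper are all made up for illustration. This only shows the mechanical check; it says nothing about whether a given use is legally OK.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Path not covered by any Disallow rule.
    print(may_fetch(SAMPLE_ROBOTS_TXT, "example-bot",
                    "https://data.example.gov/crime.csv"))      # True
    # Path under the disallowed /private/ prefix.
    print(may_fetch(SAMPLE_ROBOTS_TXT, "example-bot",
                    "https://data.example.gov/private/x.csv"))  # False
```

Against a live site you would point the parser at the real file with set_url() and read() instead of passing text in. An empty robots.txt parses to "everything allowed", which matches the "no robots.txt" case above.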
posted by mrflip at 11:57 PM on April 1, 2008

