Can I legally harvest facts posted in an Internet forum, reformat them, and display them in a more user-friendly format on my web site?
December 7, 2008 3:59 PM

I understand that you are not my attorney. I have also searched the Ask MeFi archives and found posts related to forum postings, but not specifically facts contained within them.

The situation is as follows: There are several forums that display user-contributed data in a very disorganized manner. The data are summarized in a particular format in forum postings.

I have created a template that presents this data in a much more user-friendly format. The template also includes a form so that users can add additional data.

However, in order to incentivize people to visit the site, I will have to have some data already posted. As far as I know, the only sources for this data are the forums I described.

I recognize that data cannot be copyrighted. Am I free to pull this data from the relevant web sites, reformat it, and publish it on my own web site, even if they claim that all of their content is copyrighted? What if the terms of service of the forum prohibit this?
posted by underdetermined to law & government (11 comments total)
Just to clarify, I'm not talking about mining all of the data; just a selection of it. But let's disregard fair use, as it applies to quotations, for the sake of this question. Also, I am aware of Feist v. Rural.
posted by underdetermined at 4:01 PM on December 7, 2008


"Facts" published in an internet forum? You think that's what gets posted there?

In all seriousness, it's going to be hard to give a straight legal analysis here without knowing a bit more about the situation. What kind of information are we talking about? What kind of forum?

This sounds suspiciously like a law school hypothetical to me...
posted by valkyryn at 4:43 PM on December 7, 2008


I'm not a lawyer, and if you have real qualms about doing this then go talk to one.

valkyryn is right in saying that we need to know what kind of facts you're after in order to give you real advice.

Assuming you mean actual fact facts, as in things that can be proven true (i.e., there are 12 inches in a foot), then I don't see a problem with it. Maybe cite your source if you feel like being nice. But facts like that are facts, and there's the whole common knowledge thing, and not having to cite sources with that.

If it's a fact about that forum (ForumA has 12,573 members as of 5 July 3096) then you might run into some trouble. But again, if it's public knowledge (and some forum statistics are), then you should be fine.
posted by theichibun at 4:52 PM on December 7, 2008


Maybe. If these forums post just bare facts, as an unoriginal selection, in an unoriginal presentation, then yes, as far as I can tell you should be able to legally harvest them. Just because the forum owner worked hard for the facts does not mean they get any copyright protection. What you need is some sort of original expression of the facts: a selection, an arrangement, an interpretation, etc.

Also, I don't see how a TOS or such could trump your rights under copyright law. Facts are in the public domain. If people could somehow make this not so, they would do it!

This article about AP Polls has some interesting discussion.

IANAL.
posted by sbutler at 4:56 PM on December 7, 2008


Here's an example:

If you logged onto, say, a Neil Young forum, and poster Bob has a post with the songs Neil played in Ottawa the night before, you can copy and paste those songs and do whatever you want with them. The list of songs played are facts.

If, however, you cut and paste Bob's review, or his comments after the songs, then you are not just taking facts. Then it's about opinion.

About five years ago an online forum I was part of won a lawsuit against someone who was harvesting user opinion as posted in the online forum without permission or attribution. Again, not just facts, but opinion and commentary.

we now have blatant blinking (well, almost) disclaimers everywhere that the words of the individuals who post belong to those individuals and cannot be reposted, but that's just so we don't have to waste time with thieves again (or make the whole process easier).

I also had a website with facts AND opinion in a format that is now very common, but wasn't common when we started the website. someone decided that they didn't like our format and they were going to take our 'facts'. again, facts are facts, but when you take opinion and rewrite it like you'd rewrite someone else's homework to make it SEEM different (use "do not" instead of "don't" and the like), you also won't have a case.

My question to you is this: if you're SO sure you're on the up and up, why on earth don't you reach out to the people posting this information, and the forum aggregating them now, and say, "i think there's a better way to do this, I've done this work to make things better, here it is, i'd love it if everyone would contribute since of course i wouldn't want to exploit the community here, my only motive is to make the information easier to access" and then invite folks to do so. Or, then just plain ask the forum if you can use their information. even if you're legally covered, why not just be courteous and ask?

the fact that all you want to know is whether or not you can get sued for what you want to do makes it seem somewhat suspicious to me. if you're so sure you're on the up and up you would have just done it already.
posted by micawber at 5:35 PM on December 7, 2008


Facts, yes. Opinions or creative work, no.
posted by zippy at 6:00 PM on December 7, 2008


Micawber makes a lot of good points. Considering (generally speaking) that most forums consist of opinion sprinkled with facts, I'd have to say that what you are referring to sounds like information scraping, which is a practice frowned upon not only by website owners but also by the search engines who will penalize your site in rankings. If you're pulling information from a website without providing a link back to that site or at the VERY least citing the source, then you are essentially stealing content from those sites to populate your own site. I think that if you have solid intentions of helping people process this information better, and not just scrape the content to get your own site moving along, then you need to have open relationships with the sites that you pull the information from.
posted by ISeemToBeAVerb at 6:11 PM on December 7, 2008


Even harvesting "facts" is not in the clear.

Someone gathered these facts and that took effort. For example the 15 minute delay on market quotes is because, while stock prices are "fact" they are the property of the stock exchange. Photos are also "Facts" but they are copyright of the photographer, etc. Major League Baseball is notorious for suing anyone who uses game stats without permission.

If you're trying to prime your site with content you're doing it wrong, and if I found out in any way that the content on your community site had been pilfered I'd go out of my way to bring you down.

Do it honestly. Contact the authors of the original content and invite them to your site. Ask them, and possibly even pay them (something small, give them enhanced membership or something) to participate. You don't say what it is, but you might try something askew like Craigslist or Mechanical Turk for the content you're after.

I primed a site with Mechanical Turk and it was pretty good. I had to hand filter a lot, but the content was pretty good, honest, and was a small part of my promotional budget.
posted by Ookseer at 6:24 PM on December 7, 2008


In and of themselves, facts cannot be copyrighted. Their particular format (and material other than the facts themselves) can be.

If you take great care to be sure you are taking only facts, you won't lose a lawsuit. That doesn't mean you won't get sued (for that!), it just means you won't lose.

Ookseer's reasoning is incorrect. Stock quotes cannot be immediately disseminated because of contractual obligation, not because of copyright law. Photographs are inherently artistic compositions (according to US case law).

MLB sues because they can, not because there is any violation of copyright, although there may be a contractual element also for someone who actually attended a game.

If the terms of service of the forum prohibit it, and the data cannot be accessed without first agreeing (by signing up for an account, for example), the forum owner could sue and possibly win damages for violating their terms by misusing their site, but not for violation of copyright.

(I am not a lawyer, and I'm talking US copyright law... if you actually listen to me you are an idiot)
posted by wierdo at 3:12 AM on December 8, 2008


IANAL, but I do have a casual interest in intellectual property law. An idea cannot be copyrighted, but an expression of an idea (the wording, organization, images, etc.) can be copyrighted. Facts cannot be copyrighted, but (and I believe there may be some legal uncertainty about this) collections of facts are protected. For example, my name and phone number are facts not protected by intellectual property law, but a phone directory may be copyrighted by its publisher as an organized collection of facts. Databases are protected in the same way, because the collectors/authors of the data have added intellectual value by organizing it.

In short, as long as you are not using the words of the original posters on that forum, you are probably OK. (I don't really see how you could do so in an automated fashion, however.)
posted by paulg at 7:11 AM on December 8, 2008


"Also, I don't see how a TOS or such could trump your rights under copyright law."

If underdetermined agreed to a TOS before accessing the forum data, he might have agreed to limits on what he could do with the data which were narrower than what the law would otherwise allow. It doesn't "trump" his rights; in such a case he would have voluntarily accepted limits to his rights.
posted by DevilsAdvocate at 10:01 AM on December 8, 2008

