Extracting Data from the SEC's EDGAR System: How (Without XBRL)?
I'm trying to run some value-investing analyses and have hit a huge roadblock. I want to extract SEC filings (via the EDGAR database) on a daily basis and have the data inserted into a MySQL database. I'm basically looking at forms 10-Q, 10-K, and DEF 14A, plus forms 3, 4, and 5. After the initial setup, I'd only need to download forms that have changed (via a change of timestamp, I'm assuming).

I can handle the MySQL end, but I'm having difficulty figuring out how to retrieve the data from the SEC itself without having to manually download every single file.

Also, due to significant reporting errors, I'd really like to avoid using XBRL.

Any advice?

I have some experience with this type of thing but not much experience with this specific system.

What's your access to the files? FTP? HTTP? What does the list of files look like? Are there timestamps in the file names or otherwise available?
posted by RustyBrooks at 10:19 AM on June 4, 2013

This is a question that would be well suited for the OpenData Stackexchange site.
posted by phearlez at 10:59 AM on June 4, 2013

Libraries/modules are available for:
posted by rhizome at 1:42 PM on June 4, 2013

Best answer: I just helped someone do this for Form D filings. Just automate downloading of the XML files.
posted by wongcorgi at 3:11 PM on June 4, 2013 [1 favorite]
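
For anyone wanting a starting point for the automation, here is a rough sketch in Python. It is not what wongcorgi actually used. It assumes EDGAR publishes a pipe-delimited daily "master" index with the columns CIK, Company Name, Form Type, Date Filed, and File Name, and that the index lives at the URL pattern shown in `daily_index_url`; check both against the SEC's current documentation before relying on them. The function names, the `WANTED_FORMS` set, and the User-Agent string are made up for illustration.

```python
import datetime
import urllib.request

# Form types from the question, spelled the way the index is assumed to list them.
WANTED_FORMS = {"10-Q", "10-K", "DEF 14A", "3", "4", "5"}


def daily_index_url(day):
    """Build the assumed URL of the master index for one date."""
    quarter = (day.month - 1) // 3 + 1
    return ("https://www.sec.gov/Archives/edgar/daily-index/"
            f"{day.year}/QTR{quarter}/master.{day:%Y%m%d}.idx")


def fetch_text(url, user_agent="Example Name example@example.com"):
    """Download a URL as text. The SEC asks clients to identify themselves,
    so replace the placeholder User-Agent with your own contact details."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("latin-1")


def parse_master_index(text, wanted=WANTED_FORMS):
    """Return one dict per index row whose form type is in `wanted`."""
    filings = []
    for line in text.splitlines():
        parts = line.split("|")
        # Preamble, the dashed separator, and the column header are skipped here.
        if len(parts) != 5 or parts[0] == "CIK":
            continue
        cik, company, form, date_filed, path = (p.strip() for p in parts)
        if form in wanted:
            filings.append({"cik": cik, "company": company, "form": form,
                            "date_filed": date_filed, "path": path})
    return filings


def new_filings(filings, seen_paths):
    """Drop filings whose path is already stored (e.g. in the MySQL table)."""
    return [f for f in filings if f["path"] not in seen_paths]
```

A daily job would call `fetch_text(daily_index_url(datetime.date.today()))`, pass the result through `parse_master_index`, and download only what `new_filings` returns. Keying on the file path rather than a timestamp sidesteps the question of whether EDGAR exposes modification times, since each filing is assumed to get its own path.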

Response by poster: Thanks everyone!
posted by NYC-BB at 12:04 PM on June 6, 2013

