

Extracting Data from SEC's EDGAR system: How (w/o XBRL)
June 4, 2013 9:15 AM

I'm trying to run some value-investing analyses and have hit a huge roadblock. I want to extract SEC filings (via the EDGAR database) on a daily basis and have the data inserted into a MySQL database. I'm basically looking at Forms 10-Q, 10-K, and DEF 14A, plus Forms 3, 4, and 5. After the initial setup, I'd only need to download forms that have changed (detected via a change of timestamp, I'm assuming).

I can handle the MySQL end, but I'm having difficulty figuring out how to retrieve the data from the SEC itself without having to download every single file.

Also, due to significant reporting errors, I'd really like to avoid using XBRL.

Any advice?
posted by NYC-BB to Technology (6 answers total) 2 users marked this as a favorite
 
I have some experience with this type of thing but not much experience with this specific system.

What's your access to the files? FTP? HTTP? What does the list of files look like? Are there timestamps in the file names or otherwise available?
posted by RustyBrooks at 10:19 AM on June 4, 2013


I would absolutely call them. I'm sure they would be helpful.
posted by JPD at 10:51 AM on June 4, 2013


This question would be well suited to the Open Data Stack Exchange site.
posted by phearlez at 10:59 AM on June 4, 2013


Libraries/modules are available for:
Ruby
Python
posted by rhizome at 1:42 PM on June 4, 2013


I just helped someone do this for Form D filings. Just automate downloading of the XML files.
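
For anyone wondering what "automate" might look like without pulling every file, here is a rough Python sketch of the filtering step. It is not necessarily what was done for the Form D case. It assumes you have already fetched the text of one of EDGAR's index files, and that the index uses the pipe-delimited layout `CIK|Company Name|Form Type|Date Filed|File Name`; check both against a real index file before relying on it. The names `Filing`, `parse_master_index`, `new_filings`, and the `SAMPLE` rows are made up for illustration.

```python
from typing import Iterable, List, NamedTuple, Set

# Form types the question asks about, as assumed to appear in the index.
WANTED_FORMS = {"10-Q", "10-K", "DEF 14A", "3", "4", "5"}

# Hypothetical sample rows in the assumed layout (not real filings).
SAMPLE = """\
Description: Master Index of EDGAR Dissemination Feed
CIK|Company Name|Form Type|Date Filed|File Name
--------------------------------------------------
1000001|EXAMPLE CORP|10-Q|20130604|edgar/data/1000001/0000000000-13-000001.txt
1000002|SAMPLE INC|8-K|20130604|edgar/data/1000002/0000000000-13-000002.txt
1000003|DEMO LLC|4|20130604|edgar/data/1000003/0000000000-13-000003.txt
"""


class Filing(NamedTuple):
    cik: str
    company: str
    form_type: str
    date_filed: str
    path: str


def parse_master_index(text: str, wanted: Iterable[str] = WANTED_FORMS) -> List[Filing]:
    """Return only the index rows whose form type is in `wanted`."""
    wanted_set = set(wanted)
    filings = []
    for line in text.splitlines():
        parts = line.strip().split("|")
        # Skip preamble, the column header, and separator lines:
        # data rows have exactly five fields and a numeric CIK.
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        filing = Filing(*parts)
        if filing.form_type in wanted_set:
            filings.append(filing)
    return filings


def new_filings(filings: List[Filing], seen_paths: Set[str]) -> List[Filing]:
    """Drop filings whose path is already recorded (e.g. in your MySQL table)."""
    return [f for f in filings if f.path not in seen_paths]


if __name__ == "__main__":
    for f in new_filings(parse_master_index(SAMPLE), seen_paths=set()):
        print(f.form_type, f.company, f.path)
```

The idea is that you only download the documents behind the rows that survive both filters, and you keep the set of already-seen paths in your own database instead of relying on file timestamps.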
posted by wongcorgi at 3:11 PM on June 4, 2013 [1 favorite]


Thanks everyone!
posted by NYC-BB at 12:04 PM on June 6, 2013

