automate this task, please
December 2, 2007 10:50 PM
How do I parse a few lines from several hundred Word documents into a spreadsheet?
I need to go through more than 700 Word documents in a folder structure two levels deep and pull a few key details out of each one. For one column I need a unique identifier contained in the cover page of the document, and for another I need the name of the document, which always follows the colon on the same line.
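For the folder-walking part, a minimal Python sketch might look like the following. It assumes the files end in .doc or .docx and that the immediate parent folder is what belongs in the Folder Name column; the function name find_word_documents is made up for illustration. It only locates the files and does not read their contents.

```python
from pathlib import Path

# Assumption: the documents use one of these two extensions.
WORD_SUFFIXES = {".doc", ".docx"}

def find_word_documents(root):
    """Yield (folder_name, path) for every Word file anywhere under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in WORD_SUFFIXES:
            # The immediate parent folder is taken as the "Folder Name".
            yield path.parent.name, path
```

Because rglob descends through every subfolder, the two-level depth needs no special handling.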
This line is always on page 2 of the Word document:
UID ###: DOCNAMEGOESHERE
Where ### is the 2-3 digit number and DOCNAMEGOESHERE is the name of the document.
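Once the text of a document is available as plain lines, a line in that shape could be picked apart with a regular expression. This is only a sketch in Python: the pattern assumes exactly "UID", whitespace, 2-3 digits, a colon, and then the name, and parse_uid_line is a hypothetical helper name.

```python
import re

# Assumed shape: "UID", a 2-3 digit number, a colon, then the document name.
UID_LINE = re.compile(r"^\s*UID\s+(\d{2,3})\s*:\s*(.+?)\s*$")

def parse_uid_line(line):
    """Return (uid_number, uid_name), or None if the line does not match."""
    match = UID_LINE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The number is kept as a string so that any leading zero survives the trip into the spreadsheet.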
The header name should be easy to find and unique within each document, since it sits on its own line as:
Header Name: HEADERNAME
Where HEADERNAME is the name of the header.
The columns in the spreadsheet are:
Header Name | UID # | UID Name | Folder Name
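To make the target layout concrete, here is a rough Python sketch that turns one document's text lines into a row with those four columns and writes the rows out as CSV, which a spreadsheet program can open. It assumes the document text has already been extracted into a list of lines by some other means; build_row and rows_to_csv are illustrative names, and the two patterns are guesses based on the line formats described above.

```python
import csv
import io
import re

# Assumed line shapes, taken from the description of the documents.
HEADER_LINE = re.compile(r"^\s*Header Name\s*:\s*(.+?)\s*$")
UID_LINE = re.compile(r"^\s*UID\s+(\d{2,3})\s*:\s*(.+?)\s*$")

COLUMNS = ["Header Name", "UID #", "UID Name", "Folder Name"]

def build_row(lines, folder_name):
    """Build one spreadsheet row from a document's lines (first match wins)."""
    header_name = uid_number = uid_name = ""
    for line in lines:
        if not header_name:
            match = HEADER_LINE.match(line)
            if match:
                header_name = match.group(1)
                continue
        if not uid_number:
            match = UID_LINE.match(line)
            if match:
                uid_number, uid_name = match.groups()
    return [header_name, uid_number, uid_name, folder_name]

def rows_to_csv(rows):
    """Return CSV text with the column headings followed by the rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()
```

A document missing either line produces blank cells rather than an error, which makes the gaps easy to spot and fix by hand afterwards.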
If someone could prod me in the right direction, that would make me happy. Let me know if this isn't feasible. I just shudder at doing this manually.
posted by gerg to computers & internet (7 comments total)
Hopefully someone can take my words here and provide a non-Unix solution.
posted by Kickstart70 at 10:56 PM on December 2, 2007