Have paragraphs. Want just the hyperlinks. In Excel.
May 19, 2010 8:10 AM
Need list of trade publications. Found huge one on Ulrich's web. How can I remove all information except for the (hyperlinked) titles, ideally in a way that keeps the titles on separate lines so I can then paste them into Excel?
A custom search on Ulrich's web gave me a list of almost 1,300 English language trade publications, circulation between 50,000 and 9,999,999 (their max). Each publication listing contains additional information (publisher, country, ISSN, start year, status, price). Though it looks sort of like a chart online, I can't highlight by column, and when I copy it into MS Word 2003, I get a sort of paragraph thing. Example below (the [ ] is a checkbox, and the first part (title) is a hyperlink):
[ ] AAII Journal
American Association of Individual Investors
United States
0192-3315
1979
Active
See Full Record
[ ] AAP News
American Academy of Pediatrics
United States
1073-0397
1993
Active
USD 138.0
There are also sometimes little icons between the checkbox and the title. Going through 1297 publications and removing all the additional information by hand seems like a huge waste of time. I'm convinced there's a more efficient way, but I can't figure out what it is and Google hasn't helped.
I thought of pasting it into Excel directly and using macros to remove 6 out of every 7 lines, but it's probably a bad sign that even using "Clear Contents" on the entire sheet does not remove the checkboxes or icons.
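For what it's worth, the "keep one line out of every seven" idea can be sketched outside Excel, on the plain text you get after pasting into a text editor. This is only a rough sketch: the function name `titles_from_fixed_records` and the sample text are made up for illustration, and it assumes every listing has exactly seven non-empty lines with the title first. If any listing is missing a field, everything after it will drift.

```python
# Hypothetical sample: two listings, seven lines each, title first.
SAMPLE = """\
[ ] AAII Journal
American Association of Individual Investors
United States
0192-3315
1979
Active
See Full Record
[ ] AAP News
American Academy of Pediatrics
United States
1073-0397
1993
Active
USD 138.0
"""


def titles_from_fixed_records(text, lines_per_record=7):
    """Return the first line of every fixed-size record, minus a leading '[ ]'."""
    # Drop blank lines so stray empty rows do not shift the count.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    titles = []
    for i in range(0, len(lines), lines_per_record):
        title = lines[i]
        # Assumption: the checkbox survives the paste as the literal text "[ ]".
        if title.startswith("[ ]"):
            title = title[3:].strip()
        titles.append(title)
    return titles


if __name__ == "__main__":
    # One title per line, ready to paste into a single Excel column.
    print("\n".join(titles_from_fixed_records(SAMPLE)))
```

The output is plain text with one title per line, so pasting it into Excel should fill a single column without bringing the checkboxes or icons along.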
Again, the titles are the only hyperlinked text, and I want to keep them on separate lines. The goal is to end with a list of all 1297 titles in an Excel column and nothing else.
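Since the titles are the only hyperlinks, another possible route is to save the results page as HTML and pull out just the link text. The sketch below uses only Python's standard library. The sample markup is hypothetical (I don't know what Ulrich's actually serves), and the names `LinkTextParser`, `extract_link_texts` and `to_csv` are mine. If the real page turns out to contain other links, they would need to be filtered out.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for a saved results page.
SAMPLE_HTML = """
<table>
  <tr><td><input type="checkbox"></td>
      <td><a href="/title/1">AAII Journal</a></td>
      <td>American Association of Individual Investors</td></tr>
  <tr><td><input type="checkbox"></td>
      <td><img src="icon.gif"><a href="/title/2">AAP News</a></td>
      <td>American Academy of Pediatrics</td></tr>
</table>
"""


class LinkTextParser(HTMLParser):
    """Collects the visible text of every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self._parts = []
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # Only anchors that actually carry an href count as hyperlinks.
        if tag == "a" and dict(attrs).get("href"):
            self._in_link = True
            self._parts = []

    def handle_data(self, data):
        if self._in_link:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Collapse internal whitespace and skip empty (icon-only) links.
            text = " ".join("".join(self._parts).split())
            if text:
                self.titles.append(text)
            self._in_link = False


def extract_link_texts(html):
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.titles


def to_csv(titles):
    """Return the titles as one-column CSV text that Excel can open."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for title in titles:
        writer.writerow([title])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(extract_link_texts(SAMPLE_HTML)), end="")
```

Writing the result of `to_csv` to a `.csv` file and opening it in Excel should give one title per row. Unlike counting lines, this does not depend on every listing having the same number of fields.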
Finally, I've tried making a list on Ulrich's web of the titles I want but that format doesn't look any more workable. Does anyone who is familiar with the site know a way to manipulate the data there in a way that would make this easier?
posted by randomname25 to computers & internet (10 answers total)
posted by soelo at 8:22 AM on May 19, 2010