
How to track the sources a website pulls news from?
September 29, 2012 5:41 PM

I'm wondering if anyone could suggest a tool or website that allows me to see a list of sites that a particular website is pulling news from. I know I can do this manually by looking at articles and noting the sites that are sourced, but I'm wondering if there is an easier way to do it. Any ideas? Thanks, - Michael
posted by ISeemToBeAVerb to Computers & Internet (2 answers total)
 
If you have access to a UNIX command line, something like:

wget -qO- http://particular_website.com | grep -of file_with_root_website_addresses.txt | sort | uniq -c > output.txt

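where file_with_root_website_addresses.txt would contain one grep pattern per line (these are regular expressions, so a shell-style * wildcard won't work; a bare domain matches any subdomain):
nbcnews.com
http://www.reuters.com
http://www.eweek.com
...etc
...for each news agency you were looking for.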
Since this is more of a hint than explicit instructions, here's the wget manual and the grep manual
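
If the source sites aren't known in advance, a variation on the same idea can list every host the page links to, with a count for each. This is only a rough sketch: count_link_hosts is a made-up helper name, and it assumes the links appear as double-quoted absolute href attributes, which won't hold for every page.

```shell
# count_link_hosts (hypothetical helper): read HTML on stdin and print
# one "count host" line per linked host, most frequent first.
# Assumes links look like href="http://host/..." or href="https://host/...".
count_link_hosts() {
  grep -oiE 'href="https?://[^/"]+' |   # keep only the scheme + host part of each link
    sed -E 's#^[^/]*//##' |             # drop everything up to and including "//"
    tr '[:upper:]' '[:lower:]' |        # hostnames are case-insensitive
    sort | uniq -c | sort -rn           # count each host, largest count first
}

# Example use (needs network access, so left commented out):
#   wget -qO- http://particular_website.com | count_link_hosts > output.txt
```

The page's own host will usually top the list, since most links are internal; the news sources would be among the remaining entries.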
posted by Orb2069 at 11:33 AM on September 30, 2012


Thanks Orb2069, that sounds promising, I'll give it a shot.
posted by ISeemToBeAVerb at 12:51 PM on September 30, 2012

