List of links?
May 15, 2011 10:17 AM Subscribe
I'm looking for a sniffer, but one that only reports links that are accessed behind the curtain, and not a true packet analyzer.
Warning, I know enough to be dangerous, but I'm not an expert in this area.
First, some background. I can't stand listening to podcasts on the web and I'm always looking to get direct links so I can listen to them in my various mp3 players. Most of the time I'm successful using my browser's cache to get the links, but sometimes they use too much web mumbo jumbo to hide the links.
I'm looking for software that will catalog every connection my browser makes in text output. That way, buried in there (hopefully) will be the direct links to the files.
This really simplified example shows the kind of output I'm looking for. Basically, on my end, let's say I visit IMDB.COM and click on PODCAST, and then click on PLAY. I would hope to get output like this:
http://www.imdb.com/index.html
http://www.imdb.com/advert01.swf
http://www.imdb.com/advert02.swf
http://www.imdb.com/podcast.html
http://www.imdb.com/podcast.swf
http://www.imdb.com/podcast/audio/ep01.mp3
So it reports every link the browser made to load the page and the ads. Then when I clicked on the podcast page, ran the flash player, and it connected to the mp3 file, my (imaginary?) software recorded the link that was made.
Does it exist?
AdBlock won't see the final item in the OP's list, though, because in his example that connection is made by Flash Player not by the browser.
But I think this is coming at it from the wrong direction. The whole point of podcasts is that they are downloadable. Look for "podcast feed" or "podcast RSS" links nearby; the feed will contain the URLs of the podcast audio files; download and enjoy.
(Or, easier: give the feed URL to iTunes or to another podcast manager application and have it download the files for you.)
posted by We had a deal, Kyle at 11:02 AM on May 15, 2011
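To illustrate the feed suggestion above: in an RSS 2.0 podcast feed, each episode's audio URL normally sits in the `url` attribute of an `<enclosure>` element. The following is only a minimal sketch of pulling those URLs out. The feed content, the `example.com` addresses, and the function name `enclosure_urls` are made up for illustration and do not come from any real site.

```python
import xml.etree.ElementTree as ET


def enclosure_urls(feed_xml):
    """Return the url attribute of every <enclosure> element in an RSS feed."""
    root = ET.fromstring(feed_xml)
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


# Hypothetical feed, invented for this example.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://www.example.com/podcast/audio/ep01.mp3"
                 length="12345" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://www.example.com/podcast/audio/ep02.mp3"
                 length="23456" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""

for url in enclosure_urls(SAMPLE_FEED):
    print(url)
```

For a real feed you would download the XML first and pass its text to the function; a podcast manager does the same thing for you automatically.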
Paros or Burp proxies do this. You run them on localhost and point your browser to them. They parse the transactions so it's easy to see just the URLs requested; they also can break SSL for you so you can monitor encrypted traffic.
Of the two, Paros is probably easier to set up.
posted by These Premises Are Alarmed at 12:18 PM on May 15, 2011
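For anyone curious what a local logging proxy does under the hood, here is a rough sketch using only the Python standard library. It is not how Paros or Burp are implemented, just a stand-in for the idea: the browser sends each request to the proxy, the proxy records the URL and fetches the resource on the browser's behalf. It handles plain HTTP GET only (no HTTPS, no POST), and the names `UrlLoggingProxy`, `SEEN_URLS`, and `run`, along with port 8080, are assumptions made for this example.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

SEEN_URLS = []  # every URL requested through the proxy, in order

# Opener that ignores system proxy settings, so the proxy never forwards to itself.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class UrlLoggingProxy(BaseHTTPRequestHandler):
    """Logs each requested URL, then relays the response (plain HTTP GET only)."""

    def do_GET(self):
        # A client configured to use a proxy puts the absolute URL in the request line.
        url = self.path
        SEEN_URLS.append(url)
        print(url, flush=True)
        try:
            upstream = _DIRECT.open(url, timeout=30)
        except urllib.error.HTTPError as err:
            upstream = err  # an HTTPError can be read like a normal response
        except (urllib.error.URLError, ValueError, OSError):
            self.send_error(502, "Bad Gateway")
            return
        # Simplification: the whole body is buffered in memory, and urllib
        # follows redirects itself, so only the first URL of a chain is logged.
        body = upstream.read()
        status = upstream.code
        content_type = upstream.headers.get("Content-Type")
        upstream.close()
        self.send_response(status)
        if content_type:
            self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default access log; the URL was already printed


def run(port=8080):
    """Serve on localhost until interrupted."""
    ThreadingHTTPServer(("127.0.0.1", port), UrlLoggingProxy).serve_forever()
```

Calling `run()` and setting the browser's HTTP proxy to 127.0.0.1 port 8080 would print one URL per line as pages load. The dedicated tools are still the better choice for real use, since they also cover HTTPS and other request methods.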
"AdBlock won't see the final item in the OP's list, though, because in his example that connection is made by Flash Player not by the browser."
This is not true. HTTP requests from Flash go through AdBlock, so you will see them. What you won't see is rtmp:// requests (and rtmpe://, etc.), because Flash connects to those directly. The vast majority of mp3 files are served over HTTP, though, so this isn't too big of an issue.
posted by Rhomboid at 1:35 PM on May 15, 2011
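The distinction above comes down to the URL scheme. As a small sketch, this splits a list of URLs into those a browser-level tool could be expected to see and those (such as rtmp://) that Flash fetches on its own. Treating https like http here is an assumption, and the function name and the rtmp address are invented for illustration.

```python
from urllib.parse import urlparse


def split_by_visibility(urls):
    """Partition URLs into (seen by the browser, fetched directly by the plugin)."""
    visible, hidden = [], []
    for url in urls:
        scheme = urlparse(url).scheme.lower()
        # Assumption: https requests pass through the browser just like http ones.
        if scheme in ("http", "https"):
            visible.append(url)
        else:
            hidden.append(url)
    return visible, hidden


visible, hidden = split_by_visibility([
    "http://www.imdb.com/podcast.swf",
    "http://www.imdb.com/podcast/audio/ep01.mp3",
    "rtmp://media.example.com/stream/ep01",  # hypothetical streaming URL
])
print(visible)
print(hidden)
```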
Ah. Never mind me then.
posted by We had a deal, Kyle at 1:47 PM on May 15, 2011
If you're running Linux or OS X, the urlsnarf component of dsniff will do this - it's a packet sniffer that looks specifically for HTTP requests generated by any application and dumps out the URLs...
posted by russm at 9:26 PM on May 15, 2011
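The core step a tool like that performs on each captured request can be sketched as follows: read the request line and the Host header, and stitch them back into a URL. This is not dsniff's actual code, only an illustration of the idea. Capturing the packets in the first place (which needs a capture library and usually root access) is left out, and the function name `url_from_request` is made up.

```python
def url_from_request(raw):
    """Rebuild the URL from the start of a captured HTTP/1.x request, or return None."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None  # not an HTTP request line
    target = parts[1]
    if target.startswith(("http://", "https://")):
        return target  # absolute form, as sent to a proxy
    host = None
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip()
            break
    if host is None or not target.startswith("/"):
        return None
    return "http://" + host + target


captured = (
    b"GET /podcast/audio/ep01.mp3 HTTP/1.1\r\n"
    b"Host: www.imdb.com\r\n"
    b"User-Agent: ExamplePlayer/1.0\r\n"
    b"\r\n"
)
print(url_from_request(captured))
```

Because it works on the bytes on the wire, this approach sees requests from any application, not just the browser, but it cannot read encrypted (HTTPS) traffic.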
Firefox has Page Info under the Tools menu that shows you where stuff on the page comes from (see its Media tab). But if all you want to do is snarf podcasts, you don't actually need to be searching manually through lists of links; the Video Download Helper extension will do it for you. On any page containing video or audio potentially downloadable via HTTP, the Helper will display three spinning balloons in your navigation toolbar; click on them and you get a list of things you can download. If you're on a page you know contains podcasts where VDH doesn't immediately light up, clicking the Play button on one of them is usually enough to make it do so. Small, neat, works well.
posted by flabdablet at 9:56 PM on May 15, 2011 [1 favorite]
posted by Roger Dodger at 10:47 AM on May 15, 2011