<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: Hey, watch this!</title>
	<link>http://ask.metafilter.com/54327/Hey-watch-this/</link>
	<description>Comments on Ask MetaFilter post Hey, watch this!</description>
	<pubDate>Wed, 03 Jan 2007 19:31:36 -0800</pubDate>
	<lastBuildDate>Wed, 03 Jan 2007 19:31:36 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: Hey, watch this!</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this</link>	
		<description>Is there a way to monitor internal (i.e. behind the firewall) web pages for changes?  I&apos;m looking for a piece of software that works like &lt;a href=&quot;http://www.watchthatpage.com/&quot;&gt;WatchThatPage&lt;/a&gt;, but runs from my PC (which is on the corporate LAN via VPN) instead of as an external service. &lt;br /&gt;&lt;br /&gt; This would be a nifty utility if it exists.  I use information aggregation services (like &lt;a href=&quot;http://mileagemanager.com&quot;&gt;MileageManager&lt;/a&gt; to track my frequent flyer miles), but they require me to put in my username and password for each site, and I&apos;m not comfortable doing that with, for example, financial info (or risking getting fired for putting my corporate login info out on a third-party provider).  I have cookies enabled so I can log in to lots of sites automatically, and a tool like this, running from my own PC, ought to be able to access such sites (and check for changes to specific web pages) as well, something WatchThatPage can&apos;t do.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2007:site.54327</guid>
		<pubDate>Wed, 03 Jan 2007 18:57:36 -0800</pubDate>
		<dc:creator>JParker</dc:creator>
		
			<category>web</category>
		
			<category>site</category>
		
			<category>monitoring</category>
		
			<category>software</category>
		
	</item> <item>
		<title>By: hincandenza</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818010</link>	
		<description>So, you need to clarify what you&apos;re looking for, and what it is you&apos;re monitoring and what that means.  The example you gave is an external site, but you mention in the main post monitoring internal pages, which I assume means intranet/hosted on the local network.  Can you clarify that?&lt;br&gt;
&lt;br&gt;
At some level, you&apos;re either going to script something (&lt;i&gt;which isn&apos;t hard, a quickie vbscript using winhttp will work fantastically well&lt;/i&gt;), or use macro-based software to drive an actual browser session (&lt;i&gt;if you&apos;re going to rely on saved cookies, etc&lt;/i&gt;).  There may be tools to auto-grab webpages, but a script is easy, and once you have it you can make it do just about anything you want.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818010</guid>
		<pubDate>Wed, 03 Jan 2007 19:31:36 -0800</pubDate>
		<dc:creator>hincandenza</dc:creator>
	</item><item>
		<title>By: JParker</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818049</link>	
		<description>I would be happy just to be notified by email on a daily basis when the content of a particular web page has changed.  My company has thousands of web pages, and over a dozen separate sites focused on what I do, so checking them manually is just impossible.&lt;br&gt;
&lt;br&gt;
Content aggregation would be a bonus, but that&apos;s clearly more fully developed app functionality and not a requirement.&lt;br&gt;
&lt;br&gt;
Your suggestions both seem feasible, but I don&apos;t know vbscript and don&apos;t know of any pre-built browser add-ins that do this.  Basically you&apos;re saying this is something I would have to write myself?  If so, well ... continuing my self-education is on my new year&apos;s resolutions list.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818049</guid>
		<pubDate>Wed, 03 Jan 2007 19:59:35 -0800</pubDate>
		<dc:creator>JParker</dc:creator>
	</item><item>
		<title>By: hincandenza</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818149</link>	
		<description>So, if the sites internally don&apos;t require login (or even if they do, actually), the winhttp object in vbscript is &lt;i&gt;incredibly&lt;/i&gt; easy to use- I keep recommending it for just these kinds of questions.&lt;br&gt;
&lt;br&gt;
Basically, I would approach this as a two-pronged problem.  One, something that will go out, grab each page in a list of URLs, and save it to a file.  Two, a separate batch job (or whatever) that compares yesterday&apos;s pages to today&apos;s and notifies you if there are any differences.&lt;br&gt;
&lt;br&gt;
If you have thousands of pages to check (&lt;i&gt;although at that point, you&apos;d really want to be doing webserver log parsing to look for failed pages.  Is the point simply to find out whenever any page changes?&lt;/i&gt;), and want to know when they change on a daily basis, here&apos;s how I would do it:&lt;blockquote&gt;1) Set up a scheduled task on your PC that runs once a day and launches a simple vbscript, which reads a list of URLs from a file and writes each page response to its own file.  The list of sites would be a file called list.txt that looks something like the example below.  Each line is a friendly name (&lt;i&gt;no spaces or special characters&lt;/i&gt;), a comma, and then the actual link:&lt;code&gt;&lt;br&gt;
GoogleHome,http://www.google.com&lt;br&gt;
YahooHome,http://www.yahoo.com&lt;/code&gt;&lt;br&gt;
&lt;br&gt;
I&apos;ve whipped up a simple vbscript that does just this, which I&apos;ll post shortly.  You can go to the &lt;b&gt;&lt;a href=&quot;http://msdn.microsoft.com/scripting&quot;&gt;Windows Scripting Center&lt;/a&gt;&lt;/b&gt; to get loads more info, including a great Windows Script Host 5.6 downloadable help file that is an incredible resource for quickly learning vbscript.  Search on winhttp if you want to learn more sophisticated winhttp tricks.&lt;br&gt;
&lt;br&gt;
2) Have a totally &lt;i&gt;separate&lt;/i&gt; scheduled job that simply does a diff of the files from one day&apos;s scan to the next (&lt;i&gt;checking whether a file&apos;s size has changed is probably good enough&lt;/i&gt;).  That job is responsible for finding differences, and then emailing you or otherwise leaving a &quot;what&apos;s different&quot; list of files that you can easily check.  It could be as simple as a batch file that does &quot;for every file in one folder, find the same file in this other folder and see if they&apos;re different&quot;, or it could use the windiff or fc utilities to do the same thing.  I&apos;ll leave that as an exercise for you to solve- it shouldn&apos;t be too hard.&lt;/blockquote&gt;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818149</guid>
		<pubDate>Wed, 03 Jan 2007 21:45:20 -0800</pubDate>
		<dc:creator>hincandenza</dc:creator>
	</item><item>
		<title>By: hincandenza</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818152</link>	
		<description>Here is the vbscript.  Save this text as a file called crawler.vbs somewhere on your system, along with a file called list.txt in that same folder where you put the list of URLs as I define them above.  You might even try the simple two-URL list I have of google and yahoo, to try it out. &lt;br&gt;
&lt;br&gt;
Then create a folder for storing the pages, such as d:\mypages.  You can adjust this location easily by editing the RootFolder line at the beginning of the script.&lt;br&gt;
&lt;br&gt;
Now just run it, preferably from a command line in the folder that holds crawler.vbs and list.txt (the script opens list.txt by relative path).  You should see it create a subfolder named after the current date and time in the parent folder you created above.  Every page in the list should be downloaded.  Now, this script has virtually no error handling, but that&apos;s something you can add- I think you&apos;ll find that vbscript is straightforward and easy to use, and you&apos;ll pick it up real quick with this as a basis to get started.&lt;br&gt;
&lt;code&gt;&lt;br&gt;
&apos; Define root folder- this is where each day&apos;s &quot;scan&quot; will be stored&lt;br&gt;
RootFolder	= &quot;D:\myPages\&quot; &apos; this is the storage folder&lt;br&gt;
myList		= &quot;list.txt&quot; 	&apos; this is the list file&lt;br&gt;
&lt;br&gt;
&apos; Create necessary COM objects&lt;br&gt;
set FSO		= CreateObject(&quot;Scripting.FileSystemObject&quot;)&lt;br&gt;
Set myHttp 	= CreateObject(&quot;WinHttp.WinHttpRequest.5.1&quot;) &lt;br&gt;
  &apos; on windows XP; if this errors out, try removing the .1&lt;br&gt;
&lt;br&gt;
&apos; Get current date, reformat it as the folder name mypages\YYYYMMDD_HHMM\, and create that folder&lt;br&gt;
myFolderName	= RootFolder &amp;amp; mytimeStamp(Now)&lt;br&gt;
Set myFolder	= FSO.CreateFolder(myFolderName)&lt;br&gt;
&lt;br&gt;
&apos;open the listfile&lt;br&gt;
Set myURLs	= FSO.OpenTextFile(myList, 1)&lt;br&gt;
&lt;br&gt;
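&apos; Loop through the list until the file ends, i.e. until AtEndOfStream is True&lt;br&gt;
&apos; (On Error Resume Next suppresses runtime errors so one bad URL does not stop the run)&lt;br&gt;
On Error Resume Next&lt;br&gt;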
Do while Not myURLs.AtEndOfStream&lt;br&gt;
&lt;br&gt;
	&apos; split the line into a friendly name and the URL&lt;br&gt;
	myEntry		= myURLs.ReadLine&lt;br&gt;
	arrItems	= Split(myEntry, &quot;,&quot;, -1, 1)&lt;br&gt;
	myFriendlyName	= arrItems(0)&lt;br&gt;
	myLink		= arrItems(1)&lt;br&gt;
&lt;br&gt;
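	&apos; clear the previous results so a failed request is not saved as stale data&lt;br&gt;
	myResponseCode	= &quot;&quot;&lt;br&gt;
	myResponseText	= &quot;&quot;&lt;br&gt;
&lt;br&gt;
	myHttp.Open &quot;GET&quot;, myLink, False&lt;br&gt;
	myHttp.Send&lt;br&gt;
	&lt;br&gt;
	&apos; get the http status code and response text&lt;br&gt;
	myResponseCode	= myHttp.Status&lt;br&gt;
	myResponseText	= myHttp.ResponseText&lt;br&gt;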
&lt;br&gt;
	&apos; write the output to the file \friendlyname.txt&lt;br&gt;
	&apos; (third argument True = Unicode, so pages with non-ANSI characters do not make the write fail)&lt;br&gt;
	myOutputName	= myFolderName &amp;amp; &quot;\&quot; &amp;amp; myFriendlyName &amp;amp; &quot;.txt&quot;&lt;br&gt;
	Set myOutputFile	= FSO.CreateTextFile(myOutputName, True, True)&lt;br&gt;
	myOutputFile.WriteLine myResponseCode &amp;amp; vbCRLF &amp;amp; myResponseText&lt;br&gt;
	myOutputFile.Close&lt;br&gt;
	Set myOutputFile	= Nothing&lt;br&gt;
&lt;br&gt;
Loop&lt;br&gt;
&lt;br&gt;
wscript.quit&lt;br&gt;
&lt;br&gt;
&apos;-------------&lt;br&gt;
&lt;br&gt;
&apos; simply formats a date-time folder name&lt;br&gt;
Function MytimeStamp(curTime)&lt;br&gt;
  myTime	= curTime&lt;br&gt;
&lt;br&gt;
  myYear	= DatePart(&quot;yyyy&quot;, myTime)&lt;br&gt;
  myMonth	= DatePart(&quot;m&quot;, myTime)&lt;br&gt;
  if Len(myMonth) = 1 then myMonth = &quot;0&quot; &amp;amp; myMonth&lt;br&gt;
  myDate	= Datepart(&quot;d&quot;, myTime)&lt;br&gt;
  if Len(myDate) = 1 then myDate = &quot;0&quot; &amp;amp; myDate&lt;br&gt;
  myHour	= DatePart(&quot;h&quot;, myTime)&lt;br&gt;
  if Len(myHour) = 1 then myHour = &quot;0&quot; &amp;amp; myHour&lt;br&gt;
  myMin	= DatePart(&quot;n&quot;, myTime)&lt;br&gt;
  if Len(myMin) = 1 then myMin = &quot;0&quot; &amp;amp; myMin&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
  myTimeStamp	= myYear &amp;amp; myMonth &amp;amp; myDate &amp;amp; &quot;_&quot; &amp;amp; myHour &amp;amp; myMin&lt;br&gt;
&lt;br&gt;
End Function&lt;br&gt;
&lt;/code&gt;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818152</guid>
		<pubDate>Wed, 03 Jan 2007 21:50:30 -0800</pubDate>
		<dc:creator>hincandenza</dc:creator>
	</item><item>
		<title>By: JParker</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818157</link>	
		<description>hincandenza,&lt;br&gt;
Nice, I think that&apos;s exactly what I need.  A little more complex than I was hoping for, but I get to learn something in the process.  I appreciate your extensive answer.  Thank you.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818157</guid>
		<pubDate>Wed, 03 Jan 2007 21:52:22 -0800</pubDate>
		<dc:creator>JParker</dc:creator>
	</item><item>
		<title>By: hincandenza</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818166</link>	
		<description>No problem- if you have other questions, go ahead and contact me and I can give you tidbits of advice.  The script center has plenty of sample scripts to work from, as well.&lt;br&gt;
&lt;br&gt;
There might well be an app that does this, but a little scripting work not only gives you everything you need, but you learn a ton in the process.&lt;br&gt;
&lt;br&gt;
&lt;small&gt;also, clicking that &quot;mark as a favorite&quot; link would be kind of nice. :)&lt;/small&gt;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818166</guid>
		<pubDate>Wed, 03 Jan 2007 22:00:43 -0800</pubDate>
		<dc:creator>hincandenza</dc:creator>
	</item><item>
		<title>By: JParker</title>
		<link>http://ask.metafilter.com/54327/Hey-watch-this#818180</link>	
		<description>Done!  I also sent you an email at the address in your profile.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2007:site.54327-818180</guid>
		<pubDate>Wed, 03 Jan 2007 22:24:43 -0800</pubDate>
		<dc:creator>JParker</dc:creator>
	</item>
	</channel>
</rss>
