<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: WinXP Scripting</title>
	<link>http://ask.metafilter.com/9248/WinXP-Scripting/</link>
	<description>Comments on Ask MetaFilter post WinXP Scripting</description>
	<pubDate>Sun, 08 Aug 2004 07:29:45 -0800</pubDate>
	<lastBuildDate>Sun, 08 Aug 2004 07:29:45 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: WinXP Scripting</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting</link>	
		<description>What&apos;s a relatively easy way to script browser actions (events?) in Windows XP? (mi) &lt;br /&gt;&lt;br /&gt; All I want to do is automate the act of going to a website, logging in, and clicking on certain links, resulting in the download of a file that must be saved with a name that I designate in advance. And then, do it again about ten more times, choosing different files to download each time, and of course giving them different names each time. The web interface will always be exactly the same, and so will the choices.&lt;br&gt;
&lt;br&gt;
If at all possible, the browser would be the latest IE or Netscape 7.2. &lt;br&gt;
&lt;br&gt;
Note that I am not looking for some kind of slurping tool that will download the whole site, and everything linked to it.</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2004:site.9248</guid>
		<pubDate>Sun, 08 Aug 2004 07:15:58 -0800</pubDate>
		<dc:creator>bingo</dc:creator>
		
			<category>script</category>
		
			<category>browser</category>
		
			<category>windowsxp</category>
		
	</item> <item>
		<title>By: bingo</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173784</link>	
		<description>For clarity: the &quot;certain links&quot; are always exactly the same. The layout of the website is always exactly the same.  Occasionally, there may be one more choice to click on in a list of downloads, but not very often, and that&apos;s all the variety there is. &lt;br&gt;
&lt;br&gt;
The downloaded files are .csv files which will, in a perfect world, be handed off to an excel macro that will bring them to their actual purpose.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173784</guid>
		<pubDate>Sun, 08 Aug 2004 07:29:45 -0800</pubDate>
		<dc:creator>bingo</dc:creator>
	</item><item>
		<title>By: cheaily</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173792</link>	
		<description>I&apos;m assuming that the CSV files are generated by the website?&lt;br&gt;
&lt;br&gt;
I don&apos;t think it&apos;s possible to script that much detail, even using Windows Script Host in XP.&lt;br&gt;
&lt;br&gt;
Depending on how the website is set up (or, even if you have access to modify the site yourself), you might be able to fudge the URL to automatically pick the options you require for each file.&lt;br&gt;
&lt;br&gt;
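A minimal sketch of the URL idea, assuming the site encodes each choice in the query string. The address http://example.com/report.csv and the parameter names report and format are made up for illustration; the real ones would have to be read off the site itself.&lt;br&gt;
&lt;br&gt;
```python
from urllib.parse import urlencode

# Hypothetical base URL and parameter names; the real site's query
# string would have to be read off the address bar or the page source.
BASE_URL = "http://example.com/report.csv"


def build_report_url(report, fmt="csv"):
    """Return a URL that selects one report via query-string options."""
    return BASE_URL + "?" + urlencode({"report": report, "format": fmt})


# One URL per file to fetch; the report names are placeholders.
urls = [build_report_url(name) for name in ("sales", "inventory")]
```
&lt;br&gt;
If the site works this way, each URL could be handed to whatever does the downloading, with the login handled separately.&lt;br&gt;
&lt;br&gt;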
If you want, email me, or post some more details.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173792</guid>
		<pubDate>Sun, 08 Aug 2004 08:36:04 -0800</pubDate>
		<dc:creator>cheaily</dc:creator>
	</item><item>
		<title>By: jragon</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173793</link>	
		<description>Other than the incredibly expensive WinRunner, I can only think of Mac and Unix solutions.  If you install curl, it can do a lot of this logic, especially if the site is barely changing.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173793</guid>
		<pubDate>Sun, 08 Aug 2004 08:40:20 -0800</pubDate>
		<dc:creator>jragon</dc:creator>
	</item><item>
		<title>By: five fresh fish</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173796</link>	
		<description>I did it with Python, Bingo.  I&apos;m looking to move, so I wanted to scrape the MLS.CA website for new listings every day.  Turns out to be dead easy!  I&apos;m sure the code is inefficient and ugly, but it took less than two hours for me to go from zero to full-speed.  I love Python!&lt;br&gt;
&lt;br&gt;
&lt;pre&gt;from BeautifulSoup import BeautifulSoup&lt;br&gt;
import urllib2, ClientCookie, pickle, webbrowser&lt;br&gt;
&lt;br&gt;
def MakeSoup(pageno):&lt;br&gt;
    #fetch a page and make it into delicious soup&lt;br&gt;
    urlstart = &apos;XXXobscuredXXX&apos;&lt;br&gt;
    urlend = &apos;XXXobscuredXXX&apos;&lt;br&gt;
&lt;br&gt;
    request = urllib2.Request(urlstart+str(pageno)+urlend)&lt;br&gt;
&lt;br&gt;
    request.add_header(&apos;Accept-charset&apos;,&apos;utf-8,*&apos;)&lt;br&gt;
    request.add_header(&apos;Cookie&apos;,&quot;LegalDisclaimer=1&quot;)&lt;br&gt;
&lt;br&gt;
    f = ClientCookie.urlopen(request)&lt;br&gt;
    response = f.read()&lt;br&gt;
    f.close()&lt;br&gt;
&lt;br&gt;
    soup.feed(response)&lt;br&gt;
    return soup&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
def ScrapeNumbers():&lt;br&gt;
    #scrape the numbers out of the soup&lt;br&gt;
    numlist = []&lt;br&gt;
    idlist = soup.fetch(&apos;div&apos;, {&apos;class&apos;: &apos;Label&apos;})&lt;br&gt;
&lt;br&gt;
    for i in idlist:&lt;br&gt;
        s = str(i.contents[0])&lt;br&gt;
        start = s.find(&apos;MLS&apos;)&lt;br&gt;
        end = s.find(&apos;&apos;)&lt;br&gt;
&lt;br&gt;
        if start &amp;gt; 0:&lt;br&gt;
            #got an mlsno&lt;br&gt;
            mlsno = s[start+10:end].strip()&lt;br&gt;
            #get a property id, too&lt;br&gt;
            start = s.find(&apos;PropertyID&apos;)&lt;br&gt;
            end = s.find(&apos;&quot;&amp;gt;MLS&apos;)&lt;br&gt;
            propid = s[start+11:end].strip()&lt;br&gt;
            numlist.append((mlsno,propid))&lt;br&gt;
    return numlist&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
#########################&lt;br&gt;
# load our MLS number history&lt;br&gt;
try:&lt;br&gt;
    mlshistory = pickle.load(open(&apos;mlslist.pickle&apos;))&lt;br&gt;
except:&lt;br&gt;
    mlshistory = []&lt;br&gt;
&lt;br&gt;
# first pass gets the page count&lt;br&gt;
print &quot;Getting page count...&quot;&lt;br&gt;
pageno = 1&lt;br&gt;
soup = BeautifulSoup()&lt;br&gt;
soup = MakeSoup(pageno)&lt;br&gt;
# identify page count&lt;br&gt;
pagelist = soup.first(&apos;span&apos;, {&apos;class&apos;: &apos;PageHeader&apos;})&lt;br&gt;
s = str(pagelist.contents[0])&lt;br&gt;
start = s.find(&apos;of&apos;)&lt;br&gt;
end = s.find(&apos;-&apos;)&lt;br&gt;
pagecount = s[start+2:end].strip()&lt;br&gt;
print &quot;There are &quot;+str(pagecount)+&quot; pages&quot;&lt;br&gt;
&lt;br&gt;
# make more soup&lt;br&gt;
while int(pageno) &amp;lt; int(pagecount):&lt;br&gt;
    print &quot;Processing page &quot;+str(pageno)&lt;br&gt;
    pageno += 1&lt;br&gt;
    soup = MakeSoup(pageno)&lt;br&gt;
&lt;br&gt;
# parse out new numbers&lt;br&gt;
newnumbers = []&lt;br&gt;
numlist = ScrapeNumbers()&lt;br&gt;
for i in numlist:&lt;br&gt;
    if i not in mlshistory:&lt;br&gt;
        newnumbers.append(i)&lt;br&gt;
&lt;br&gt;
print &quot;New Numbers:&quot;&lt;br&gt;
print newnumbers&lt;br&gt;
&lt;br&gt;
for i in newnumbers:&lt;br&gt;
    mlshistory.append(i)&lt;br&gt;
    webbrowser.open(&apos;XXXobscuredXXX&apos;+str(i[1]),1)&lt;br&gt;
&lt;br&gt;
#save the updated MLS number history&lt;br&gt;
pickle.dump(mlshistory,open(&apos;mlslist.pickle&apos;,&apos;w&apos;))&lt;br&gt;
&lt;/pre&gt;&lt;br&gt;
&lt;br&gt;
Dunno why it&apos;s double-spacing.  Makes no real difference.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173796</guid>
		<pubDate>Sun, 08 Aug 2004 09:29:34 -0800</pubDate>
		<dc:creator>five fresh fish</dc:creator>
	</item><item>
		<title>By: bingo</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173802</link>	
		<description>fff: Thanks, but I don&apos;t know enough about coding to read that (uncommented) well enough to use it in my situation. &lt;br&gt;
&lt;br&gt;
cheaily: Playing with the URL is an interesting idea.  &lt;br&gt;
&lt;br&gt;
The csv files are generated by the website, but I don&apos;t need the script to actually do anything with them other than download them. If the handing off to the excel macro has to be done manually and that&apos;s the worst of my problems, then I&apos;ll be fine.&lt;br&gt;
&lt;br&gt;
Here&apos;s what happens when I do it manually: I go to the site, which I have designated as a shortcut in IE. The username/password box comes up immediately, and IE fills it in for me. I click ok, and am presented with a page full of links that offer me choices for what kind of file I want to create. I click on blah, I click on blah, and voilà, the information is displayed before me. To show that I want to download it as a csv file, I click on blah. Windows asks me if I want to open the file, or save it. I say that I want to save it. Windows asks me what name to save it under. I tell it. The deed is done. I then go back to the website and choose some more options, and do it again.&lt;br&gt;
&lt;br&gt;
Surely, to a browser, these links all have numbers, or some other labels, that can be remembered and used to find the same links every time? I imagine (in my non-programmer mind) code that does something like this (This is me talking to the browser):&lt;br&gt;
&lt;br&gt;
a) go to the main page&lt;br&gt;
b) enter username and password, click ok&lt;br&gt;
c) You will see 20 links. Click on link #14.&lt;br&gt;
d) You will see 5 links. Click on link #2.&lt;br&gt;
e) You will see 7 links and three buttons. Click on button #1.&lt;br&gt;
f) You will get a choice of whether to open the file, or save it. Choose save.&lt;br&gt;
g) You will be asked what to name it and where to put it. Call it &quot;bingo&apos;s file #8&quot; and put it in the folder called &quot;bingo&apos;s automatically downloaded csv documents.&quot;&lt;br&gt;
h) Return to step c), but this time start with link #15 instead of #14.&lt;br&gt;
i) Repeat until you&apos;ve gone through the cycle starting with links 14, 15, 16, 17, and 18.&lt;br&gt;
&lt;br&gt;
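Steps c) through i) can be sketched as a plain list of instructions, one per cycle. This only builds the plan and does not drive a browser. The folder name comes from step g); numbering each file by its starting link and adding a .csv extension are assumptions made for illustration.&lt;br&gt;
&lt;br&gt;
```python
import os

# Hypothetical values: the folder name comes from step g; numbering the
# files by their starting link is an assumption made for illustration.
FOLDER = "bingo's automatically downloaded csv documents"
START_LINKS = range(14, 19)  # links 14 through 18, per step i


def build_plan():
    """List (first link, second link, button, save path) for each cycle."""
    plan = []
    for first_link in START_LINKS:
        save_path = os.path.join(FOLDER, "bingo's file #%d.csv" % first_link)
        # Steps d and e never change: link #2, then button #1.
        plan.append((first_link, 2, 1, save_path))
    return plan


for step in build_plan():
    print(step)
```
&lt;br&gt;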
Then, in a perfect world, the excel macro will spring into action without a human having to be there to start it. But just steps a through i would make my life a lot easier.&lt;br&gt;
&lt;br&gt;
Thanks.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173802</guid>
		<pubDate>Sun, 08 Aug 2004 10:06:47 -0800</pubDate>
		<dc:creator>bingo</dc:creator>
	</item><item>
		<title>By: nicwolff</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173851</link>	
		<description>Perl, &lt;a href=&quot;http://search.cpan.org/~leira/HTTP-Recorder-0.02/lib/HTTP/Recorder.pm&quot;&gt;HTTP::Recorder&lt;/a&gt;, and &lt;a href=&quot;http://search.cpan.org/~petdance/WWW-Mechanize-1.02/lib/WWW/Mechanize.pm&quot;&gt;WWW::Mechanize&lt;/a&gt; will do this quite easily and nicely.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173851</guid>
		<pubDate>Sun, 08 Aug 2004 14:40:32 -0800</pubDate>
		<dc:creator>nicwolff</dc:creator>
	</item><item>
		<title>By: nicwolff</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173853</link>	
		<description>Or since you&apos;re on Windows I should have linked &lt;a href=&quot;http://www.activestate.com/Products/ActivePerl/&quot;&gt;Perl&lt;/a&gt;.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173853</guid>
		<pubDate>Sun, 08 Aug 2004 14:42:55 -0800</pubDate>
		<dc:creator>nicwolff</dc:creator>
	</item><item>
		<title>By: holloway</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173873</link>	
		<description>It&apos;s not using IE but &lt;a href=&quot;http://webtest.canoo.com/webtest/&quot;&gt;Canoo&lt;/a&gt; takes an XML file of events (go here, click that, download) and plays that back.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173873</guid>
		<pubDate>Sun, 08 Aug 2004 15:37:03 -0800</pubDate>
		<dc:creator>holloway</dc:creator>
	</item><item>
		<title>By: bingo</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173881</link>	
		<description>As far as Canoo, the website won&apos;t cooperate with anything that doesn&apos;t identify itself as a recent version of IE.  &lt;br&gt;
&lt;br&gt;
nicwolff: Thanks, I guess you&apos;ve shown me that for someone at my level (programming novice; all I know is some VB and bash shell stuff), this is going to take a lot of reading on my part, even if the result would be simple to achieve for someone who knows enough.&lt;br&gt;
&lt;br&gt;
But surely then, this is a gap in the market waiting to be filled? Surely there are others like me, wanting to automate their browsing experience at a fairly simple level, but not knowing python or perl (and, in general, not needing them for my job)?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173881</guid>
		<pubDate>Sun, 08 Aug 2004 16:24:34 -0800</pubDate>
		<dc:creator>bingo</dc:creator>
	</item><item>
		<title>By: holloway</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#173932</link>	
		<description>Say what? Canoo&apos;s website works for me in Firefox 0.92 / WinXP -- and as you can see from Google, canoo is very popular.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-173932</guid>
		<pubDate>Sun, 08 Aug 2004 20:51:42 -0800</pubDate>
		<dc:creator>holloway</dc:creator>
	</item><item>
		<title>By: bingo</title>
		<link>http://ask.metafilter.com/9248/WinXP-Scripting#174336</link>	
		<description>holloway, I&apos;m not talking about the website, I&apos;m talking about the product. I have to use either IE or Netscape for the task.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2004:site.9248-174336</guid>
		<pubDate>Mon, 09 Aug 2004 17:23:33 -0800</pubDate>
		<dc:creator>bingo</dc:creator>
	</item>
	</channel>
</rss>
