Automating form interactions on a website?
February 5, 2005 3:30 PM Subscribe
Web browser macros. I want to automate my interaction with a web site.
The web browser macro setups I find with Google seem to be focused on things like checking email and logging into various websites. My needs are different, in that a) my focus is not on getting data from the website, but on entering data in the site's forms, clicking its buttons, selecting or unselecting checkboxes, etc. and b) I do not want to enter the same data every time. Rather, I want my interaction with the website to be determined by an offline file (probably Excel).
I'm using Windows XP, and could use IE or Firefox for this.
Answers that question the assumptions, as well as other answers that do not directly answer the question but are relevant anyway, are welcome.
Are you simply trying to get data into a website or are you trying to automate the testing of a web interface?
If it's the former, I would bypass the web form entirely, deconstruct the key/value data that's being submitted to the server, and write a small script that automates this (see: curl, wget.)
If it's the latter, there's tons of resources here.
posted by Loser at 3:56 PM on February 5, 2005
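A minimal sketch of the "bypass the web form" idea, using only Python's standard library. The URL and the field names are hypothetical placeholders; the real ones would come from the form's HTML. The request is built but not sent, so the encoding can be inspected first.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; the real one is the form's "action" attribute.
FORM_URL = "http://example.com/submit"

def build_form_request(fields):
    """Return a POST request carrying the fields the way a browser form would."""
    body = urlencode(fields).encode("ascii")
    return Request(
        FORM_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

# Hypothetical field names, for illustration only.
request = build_form_request({"name": "Ada Lovelace", "newsletter": "on"})
print(request.get_method())   # POST
print(request.data.decode())  # name=Ada+Lovelace&newsletter=on
# To actually send it: urllib.request.urlopen(request)
```

Supplying `data` is what makes `urllib` issue a POST rather than a GET.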
Response by poster: "Are you simply trying to get data into a website or are you trying to automate the testing of a web interface?"
I'm trying to get some data into a website.
I'm open to suggestions about good places to learn about curl or Mechanize. This and this are a bit too complex for me. I can learn, of course, but I'd like to start with different sources.
posted by bingo at 4:04 PM on February 5, 2005
If you install Mechanize, and you know a bit of perl, just get it to spit out the script and then hack the script as you need to.
If you don't know perl... not so good.
posted by orthogonality at 4:12 PM on February 5, 2005
I was able to do this with AutoIt... for a website that needed me to click a link a certain number of times.
posted by TuxHeDoh at 5:17 PM on February 5, 2005
Response by poster: I promise that it has nothing to do with spam.
posted by bingo at 8:53 PM on February 5, 2005
if it's a single page and a lot of data (many fields, many page submissions) it would probably make sense to write a chunk of code that submits the data directly without using a browser at all. you'd look at the page source, note the field names, write code to construct/send an HTTP POST message, and then loop over the data you want to enter, doing a POST for each page-worth of data. help for doing this with python, but most any language will do.
posted by andrew cooke at 3:59 AM on February 6, 2005
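The loop described above could look roughly like this in Python, assuming the Excel sheet has been saved as CSV with a header row whose column names match the form's field names. The URL, the column names, and the sample rows are all made up for illustration. The example does a dry run that collects the encoded bodies instead of contacting a server.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import Request, urlopen

FORM_URL = "http://example.com/submit"  # hypothetical endpoint

def rows_from_csv(handle):
    """Yield one dict per spreadsheet row, keyed by the header row."""
    yield from csv.DictReader(handle)

def post_row(row, send=urlopen):
    """Encode one row as form data and hand the request to `send`."""
    request = Request(FORM_URL, data=urlencode(row).encode("ascii"))
    return send(request)

# Stand-in for a file exported from Excel as CSV.
sample = io.StringIO("title,quantity\nWidget,3\nGadget,12\n")

# Dry run: record what would be sent rather than touching the network.
sent = []
for row in rows_from_csv(sample):
    post_row(row, send=lambda req: sent.append(req.data.decode()))
print(sent)  # ['title=Widget&quantity=3', 'title=Gadget&quantity=12']
```

Leaving `send` at its default (`urlopen`) would submit each row for real, so it is worth checking the dry-run output against what the browser sends before doing that.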
Response by poster: What I'm concerned about is that it's not just fields to fill in, it's widgets (radio buttons, check boxes, and some unique to the site).
Also, the url does not change when some widgets are selected. I don't know if it's javascript or what.
What kind of client do you need to send an HTTP POST message?
posted by bingo at 7:38 AM on February 6, 2005
radio buttons and check boxes are just different ways of getting information from the user. the results are sent to the server in the same way as any other input (as name/value pairs). that's assuming it's HTML. you can find HTML and HTTP info at W3C.
"unique to the site" may mean they're using flash or similar, in which case this won't work.
an HTTP POST message is just some text, sent to the server, following the format described for HTTP. most languages have libraries that let you send information to a socket, which is all that is necessary here.
but, to be honest, i'm wishing i hadn't said anything, as i'm probably leading you down an unproductive path - from your question i don't think this is going to be a suitable approach. you are probably better looking at other people's links to see if there's something that can learn what you do with the cursor in some way.
posted by andrew cooke at 8:43 AM on February 6, 2005
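To illustrate the point that widgets reduce to name/value pairs and that a POST is just text, here is a sketch in Python. The field names, host, and path are hypothetical. It assumes plain HTML checkboxes with no explicit `value` attribute (browsers then submit the value `on`) and assumes the form uses the default `application/x-www-form-urlencoded` encoding.

```python
from urllib.parse import urlencode

def form_pairs(text_fields, radio_choices, checked_boxes):
    """Flatten widget state into the (name, value) pairs a browser submits."""
    pairs = list(text_fields.items())
    pairs += list(radio_choices.items())               # one value per radio group
    pairs += [(name, "on") for name in checked_boxes]  # unchecked boxes are omitted
    return pairs

def raw_post(host, path, pairs):
    """Build the text of an HTTP/1.1 POST request for the given pairs."""
    body = urlencode(pairs)
    head = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
        "Connection: close",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(head) + "\r\n\r\n" + body

pairs = form_pairs({"title": "Widget"}, {"size": "large"}, ["gift_wrap"])
message = raw_post("example.com", "/submit", pairs)
print(message)
```

Writing `message` to a socket connected to the server's port 80 is the "send information to a socket" step; in practice an HTTP library does the same thing with less room for error.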
Just in addition to all of the other programmer talk here, if you decide to go the route of writing a script (I did something very similar to what you're describing, but in Perl) I recommend the Web Developer extension for Firefox -- it's a great tool to learn the layout and posting content of a webpage.
posted by ChrisR at 9:09 AM on February 6, 2005
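For readers without the extension, a rough way to list a page's form fields from its source, sketched with Python's standard `html.parser`. The sample page and its field names are invented; a real page would be saved from the browser with "View Source".

```python
from html.parser import HTMLParser

class FormFieldLister(HTMLParser):
    """Collect (name, kind) for every named form control on a page."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select", "textarea"):
            attributes = dict(attrs)
            if "name" in attributes:
                # For <input>, the kind is its type (text when unspecified).
                kind = attributes.get("type", "text") if tag == "input" else tag
                self.fields.append((attributes["name"], kind))

# Invented sample page, for illustration only.
page = """
<form action="/submit" method="post">
  <input type="text" name="title">
  <input type="radio" name="size" value="small">
  <input type="radio" name="size" value="large">
  <input type="checkbox" name="gift_wrap">
  <select name="country"><option>US</option></select>
  <input type="submit" value="Save">
</form>
"""

lister = FormFieldLister()
lister.feed(page)
for name, kind in lister.fields:
    print(f"{name}: {kind}")
```

Controls without a `name` attribute (the submit button here) are skipped, since browsers do not submit them.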
Response by poster: "'unique to the site' may mean they're using flash or similar, in which case this won't work."
What would work, in that case? I think it might be javascript (is there a way to find out?)
posted by bingo at 10:55 AM on February 6, 2005
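One rough way to find out is to scan the page source for tell-tale tags. The sketch below, in Python, counts `<script>` tags, counts `<object>`/`<embed>` tags (how plugins such as Flash are typically embedded), and lists inline event handlers such as `onclick`. It is a heuristic only, and the sample page, file names, and handler name are invented.

```python
from html.parser import HTMLParser

class TechnologySniffer(HTMLParser):
    """Count script and plugin tags and record inline event handlers."""

    def __init__(self):
        super().__init__()
        self.scripts = 0
        self.plugins = 0
        self.handlers = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts += 1
        elif tag in ("object", "embed"):
            self.plugins += 1
        # Attributes such as onclick or onchange hook JavaScript to a widget.
        for name, _value in attrs:
            if name.startswith("on"):
                self.handlers.append((tag, name))

# Invented sample page, for illustration only.
page = """
<script src="widgets.js"></script>
<input type="checkbox" name="rush" onclick="toggleRush(this)">
<embed src="chart.swf" type="application/x-shockwave-flash">
"""

sniffer = TechnologySniffer()
sniffer.feed(page)
print("script tags:", sniffer.scripts)
print("plugin tags:", sniffer.plugins)
print("inline handlers:", sniffer.handlers)
```

Scripts that attach their handlers from a separate file will not show up in the handler list, so an empty list does not rule JavaScript out.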
that depends on how flash gets its input from the web browser. i doubt javascript would work, since that is used generally to manipulate documents constructed by HTML, and Flash is just a "black box" from that point of view. but i may be wrong, particularly in the case of mozilla/firefox - you'd need to know the structure of the software to be certain.
more likely, your solution (if it is flash) is going to come from something that works at the level of the windowing system, recording the keyclicks and mouse movements you make while you enter data, and then reproducing them, with some kind of modification. maybe some kind of gui test framework, but i suspect there's a more user-friendly version on the market (apple used to have something like this, aeons ago, didn't it?).
or you could snoop on the data flow from the flash app back to the server (assuming it's not encrypted) and reverse engineer/emulate that. but again, i suspect that's going to involve a lot of learning on your part.
have you looked at the other links? i haven't. stavros's roboform looks like it might be what you want.
and just how much data is there? is it really more than a (tedious) day of typing? don't you have any interns/grad students/apprentices handy?
posted by andrew cooke at 11:30 AM on February 6, 2005
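If the traffic can be captured unencrypted, the reverse-engineering step might start by pulling the field names and values out of one captured request. This Python sketch assumes the body is ordinary URL-encoded form data; a Flash application may use a different body format entirely, in which case this does not apply. The captured text is invented.

```python
from urllib.parse import parse_qsl

def fields_from_capture(raw_request):
    """Split a captured HTTP request into its request line and form fields."""
    head, _, body = raw_request.partition("\r\n\r\n")
    request_line = head.split("\r\n")[0]
    # keep_blank_values preserves fields that were submitted empty.
    return request_line, parse_qsl(body, keep_blank_values=True)

# Invented capture, for illustration only.
captured = (
    "POST /submit HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "title=Blue+Widget&size=large&gift_wrap=on&notes="
)

request_line, fields = fields_from_capture(captured)
print(request_line)
for name, value in fields:
    print(f"{name!r} = {value!r}")
```

Comparing the fields from two captures that differ by a single click shows which name/value pair that widget controls.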
Response by poster: "more likely, your solution (if it is flash) is going to come from something that works at the level of the windowing system, recording the keyclicks and mouse movements you make while you enter data, and then reproducing them, with some kind of modification."
Yes, I think something like that would be ideal.
"or you could snoop on the data flow from the flash app back to the server (assuming it's not encrypted) and reverse engineer/emulate that. but again, i suspect that's going to involve a lot of learning on your part."
That would be okay if I had a solid idea of where to start. What sort of mechanism would be used to do that?
"stavros's roboform looks like it might be what you want."
Well, I looked at it a little. It calls itself a password manager. It may be able to do what I want; I'll have to spend a lot more time on the site reading the docs and the FAQ etc. But I'm concerned about anything that calls itself a password manager, because the data is going to be different each time, and the combination of widget clicks etc. is going to be different some of the time.
"and just how much data is there? is it really more than a (tedious) day of typing?"
There is a ton of data, and it's ongoing. It's an indefinite number of tedious days of typing.
Basically, I get the data, and then I have to enter it in the web site. Exactly where on the site I enter it, how much I enter, and which buttons I click etc. depend on the data.
posted by bingo at 8:14 PM on February 6, 2005
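Whatever tool ends up doing the clicking, the "depends on the data" part can be kept separate from it. A sketch in Python: each data row is turned into an ordered list of actions, and a replay tool or script would then carry them out. The column names (`section`, `title`, `rush`, `notes`) and the action vocabulary are hypothetical, and the executor itself is deliberately left out.

```python
def plan_actions(row):
    """Translate one data row into an ordered list of UI actions."""
    actions = [("open", row["section"])]           # where on the site to go
    actions.append(("fill", "title", row["title"]))
    if row.get("rush") == "yes":                   # only tick the box when needed
        actions.append(("check", "rush"))
    # How much gets entered varies: one "fill" per non-empty note.
    for note in (part.strip() for part in row.get("notes", "").split(";")):
        if note:
            actions.append(("fill", "note", note))
    actions.append(("click", "save"))
    return actions

# Hypothetical rows, as they might come out of a spreadsheet.
rows = [
    {"section": "orders", "title": "Widget", "rush": "yes", "notes": "fragile; gift"},
    {"section": "returns", "title": "Gadget", "rush": "no", "notes": ""},
]

plans = [plan_actions(row) for row in rows]
for plan in plans:
    print(plan)
```

The same plan could feed a keystroke-replay tool, or be translated into name/value pairs for a direct POST if the site turns out to use plain HTML forms.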
posted by orthogonality at 3:54 PM on February 5, 2005