Help with curl
February 15, 2008 3:22 PM

Curl response different than browser

This was working great until this afternoon, not sure what to do.
When I use this code:

http://grady.us/apps/liz/test.txt

http://grady.us/apps/liz/test

The data it prints out is:

HTTP/1.1 200 OK
Date: Fri, 15 Feb 2008 23:09:04 GMT
Server: IBM_HTTP_Server
Surrogate-Control: content="ESI/1.0"
Content-Length: 9792
Set-Cookie: JSESSIONID=0000eh4Igf-Mqg5goa-lDy0GuRd:12qjgbb49; Path=/
Set-Cookie: Registration=currentUserId:; Expires=Wed, 13 Feb 2013 23:09:48 GMT; Path=/; Domain=cars.com
Set-Cookie: affiliate=national; Expires=Fri, 07 Mar 2008 23:09:48 GMT; Path=/; Domain=cars.com
Set-Cookie: zipcode=34205; Expires=Wed, 13 Feb 2013 23:09:48 GMT; Path=/; Domain=cars.com
Set-Cookie: SessionInfo=mkid%3D34%7Cmknm%3DMitsubishi%7Cmdid%3D314%7Cmdnm%3DMontero%7C; Expires=Wed, 13 Feb 2013 23:09:48 GMT; Path=/; Domain=cars.com
Content-Type: text/html
Content-Language: en-US
Set-Cookie: cars_persist=3745585324.20480.0000; expires=Fri, 15-Feb-2008 23:38:31 GMT; path=/
Vary: Accept-Encoding

cars.com logo
Looking for something on cars.com?

Sorry, the page you requested is not available. Some linked or bookmarked pages may have moved. If you are not redirected to the cars.com homepage in a few seconds, please click here.

It is able to set cookies, and these are the cookies it sets:

# Netscape HTTP Cookie File
# http://www.netscape.com/newsref/std/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

www.cars.com FALSE / FALSE 0 JSESSIONID 0000eh4Igf-Mqg5goa-lDy0GuRd:12qjgbb49
.cars.com TRUE / FALSE 1360796988 Registration currentUserId:
.cars.com TRUE / FALSE 1204931388 affiliate national
.cars.com TRUE / FALSE 1360796988 zipcode 34205
.cars.com TRUE / FALSE 1360796988 SessionInfo mkid%3D34%7Cmknm%3DMitsubishi%7Cmdid%3D314%7Cmdnm%3DMontero%7C
www.cars.com FALSE / FALSE 1203118711 cars_persist 3745585324.20480.0000

If URL is set to simply http://www.cars.com/go/index.jsp, their home page, it works.
Help?
posted by jesirose to Computers & Internet (7 answers total)
 
Use Wireshark/tcpdump to snarf the outgoing requests to see if they are in fact identical.
posted by rhizome at 3:39 PM on February 15, 2008


Response by poster: Figures the client calls me as soon as I post this. I will have to google those and figure out what to do; this is really the first time I've used curl.
Thanks!
posted by jesirose at 3:56 PM on February 15, 2008


Best answer: An examination of the response from cars.com to your request indicates that cars.com is returning the contents out of order (some sort of buffering bug). Fiddling with the request a bit, it appears that it is a language handling bug in the cars.com web server / web application. If I modify your sample code to include an Accept-Language header...
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept-Language: en"));
... I get a valid response from cars.com.
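
For context, here is a minimal sketch of where that line could sit in a complete request. The function names build_request_headers() and fetch_with_language() are hypothetical and not taken from the original script, and the rest of the script's options are assumed rather than known.

```php
<?php
// Hypothetical helper (not from the original script): builds the list of
// extra request headers handed to CURLOPT_HTTPHEADER.
function build_request_headers($language = 'en')
{
    return array('Accept-Language: ' . $language);
}

// Hypothetical wrapper: fetches $url with the Accept-Language header set.
// Returns the response body as a string, or false if the transfer failed.
function fetch_with_language($url, $language = 'en')
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, build_request_headers($language));
    // Return the body from curl_exec() instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    if ($body === false) {
        error_log('curl error: ' . curl_error($ch));
    }
    return $body; // the handle is released when $ch goes out of scope
}
```

Keeping the header list in its own function makes it easy to add further headers later without touching the transfer code.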

By the way, spoofing the HTTP user agent to appear as if your request originates from Firefox is unnecessary and rather unfriendly. If you're going to be generating automated queries to their site you should include a user agent that is unique to your application (possibly even including a URL or e-mail address at which they can contact you if your application places undue load on their servers). For example, instead of...
$useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
...I'd recommend you use something like...
$useragent="CarsProgram/1.0 (+http://grady.us/apps/liz/bot.html)";
And then put some explanatory text and contact information at http://grady.us/apps/liz/bot.html.
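
If the same pattern is needed for several applications, the string can be assembled from its parts. This is only a sketch: build_user_agent() is a hypothetical helper, and the "Name/Version (+URL)" layout simply mirrors the example above rather than any required format.

```php
<?php
// Hypothetical helper: assembles a "Name/Version (+URL)" user agent string
// in the same shape as the CarsProgram example.
function build_user_agent($name, $version, $infoUrl)
{
    return sprintf('%s/%s (+%s)', $name, $version, $infoUrl);
}

$useragent = build_user_agent('CarsProgram', '1.0', 'http://grady.us/apps/liz/bot.html');

// The result is then passed to cURL in the usual way:
//   curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
```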
posted by RichardP at 5:00 PM on February 15, 2008


Response by poster: Thank you so much Richard P
posted by jesirose at 6:52 PM on February 15, 2008


Response by poster: The reason I had set it to Firefox was that it was in the code I got from an example on how to accept cookies. Will the custom user agent still be able to accept cookies if I need to use this for other projects?
Thanks a ton, you saved my butt :)
posted by jesirose at 6:55 PM on February 15, 2008


Will the custom user agent still be able to accept cookies if I need to use this for other projects?

Sure. The user agent identifies the client to the server; it shouldn't interact with cookie handling at all. The only case I can think of where it would matter is if the server did some really odd server-side processing involving browser sniffing and cookies (e.g. set cookie A if the client identifies itself as IE, but set cookie B if it identifies itself as Firefox). Browser sniffing isn't unheard of, but I have never seen it used to determine which cookies will be set by the server.
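
To illustrate the independence of the two settings, here is a rough sketch in which the user agent and the cookie storage are configured as separate options on the same handle. make_cookie_handle() and its parameters are hypothetical names, and the cookie file path is whatever your script already uses.

```php
<?php
// Hypothetical helper: prepares a cURL handle with a custom user agent and
// a cookie file. The two concerns are set through unrelated options.
function make_cookie_handle($url, $userAgent, $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_USERAGENT      => $userAgent,  // identifies the client only
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // previously saved cookies are read from here
        CURLOPT_RETURNTRANSFER => true,        // curl_exec() returns the body
    ));
    return $ch;
}
```

Changing the first option has no effect on the other two, so swapping the Firefox string for a custom one should leave cookie handling as it was.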
posted by RichardP at 7:23 PM on February 15, 2008


Response by poster: Right, I have seen that too - good call.
Thanks again and have a great evening!
posted by jesirose at 7:33 PM on February 15, 2008

