Running into Ruby Restriction
March 9, 2009 4:06 AM
In Ruby, how can you get around the ~65,500 character limit when grabbing a web page?
I'm new to Ruby, but have started using it to scrape information from websites. I have been using either the Hpricot package or Net::HTTP. Unfortunately, when I've tried using these to scrape larger web pages, the streams cut off after about 65,500 characters. I haven't found any information online about this. Is there a way to get around this limit? Can you split the stream across two arrays or strings? Or will I have to manage the stream myself with new code?
posted by FuManchu to computers & internet (7 comments total)
I would try to process the screen-scraped string as it comes in, though. In Perl: while (
posted by devnull at 5:36 AM on March 9
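For anyone who wants to try the "process it as it comes in" idea in Ruby rather than Perl, here is a rough sketch. It assumes the standard net/http library and uses the block form of read_body, which hands you the body one chunk at a time instead of as a single string. ChunkCollector and fetch_in_chunks are made-up names for illustration, not part of Hpricot or Net::HTTP, and this is not necessarily what the Perl snippet above was going to do.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: keeps each chunk and a running byte count, so you
# can see whether anything is lost past the ~65,500 mark.
class ChunkCollector
  attr_reader :chunks, :bytes_seen

  def initialize
    @chunks = []
    @bytes_seen = 0
  end

  # Append one chunk; returns self so calls can be chained.
  def <<(chunk)
    @bytes_seen += chunk.bytesize
    @chunks << chunk
    self
  end

  # Join everything received so far into one string.
  def to_s
    @chunks.join
  end
end

# Hypothetical helper: streams the response body for +url+ into
# +collector+ chunk by chunk instead of reading it in one call.
def fetch_in_chunks(url, collector = ChunkCollector.new)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      # With a block, read_body yields each fragment as it arrives.
      response.read_body { |chunk| collector << chunk }
    end
  end
  collector
end
```

You would call it as something like page = fetch_in_chunks('http://example.com/big_page.html') (a placeholder URL), then compare page.bytes_seen with what the browser reports and pass page.to_s to Hpricot. If you only need part of the page, you can do the matching inside the block and skip keeping the chunks at all.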