How can I load (and sort) a really huge text file in Perl?
October 8, 2008 10:06 PM
How do I load and then sort a really huge (94 MB) text file containing a list of words? Perl runs out of memory trying to load all the lines, before I have a chance to sort them.
My text file contains a list of words, one per line. I'd like to sort this list into alphabetical order (and then ultimately traverse the sorted list to easily parse out all of the non-unique tokens). Here is what I was doing in Perl:
open LARGELIST, "large_source_file.txt" or die $!;
print "Loading file...\n";
@lines = <LARGELIST>;
print "Done";
@sorted = sort(@lines);
After chugging for a while, perl runs out of memory during the file-loading stage. I'm definitely not a Perl guru, so is there a more memory-efficient way to do this (or is there some entirely different approach or language I should be trying)?
posted by kosmonaut to computers & internet (25 comments total)
posted by Class Goat at 10:20 PM on October 8, 2008
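For what it's worth, here is a minimal sketch of one memory-friendlier way to get at the non-unique words without slurping the file into an array first. It reads one line at a time and counts each word in a hash, then sorts only the words seen more than once. The sub name duplicate_words is made up for illustration, and the sketch assumes the set of distinct words fits in memory; if it doesn't, an external sort (for example the Unix sort utility followed by uniq -d) is probably the better route.

```perl
use strict;
use warnings;

# Hypothetical helper: return, in alphabetical order, every line that
# occurs more than once in the file at $path.
# Assumption: the distinct words fit in memory as hash keys.
sub duplicate_words {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";

    my %count;
    # Reading in scalar context pulls one line per iteration,
    # so the whole file is never held in memory at once.
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';    # skip blank lines
        $count{$line}++;
    }
    close($fh);

    # Sort only the words that appeared more than once.
    my @dups = sort { $a cmp $b } grep { $count{$_} > 1 } keys %count;
    return @dups;
}

# Usage: perl dups.pl large_source_file.txt
if (@ARGV) {
    print "$_\n" for duplicate_words($ARGV[0]);
}
```

The main difference from the code in the question is that `<$fh>` is read inside a while loop (one line per pass) instead of being assigned to an array, which is what forces perl to load every line at once.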