Help me deal with a huge text file!
September 1, 2011 6:55 PM

On a Win7 desktop, how can I open and read a 1.61 GB CSV file containing text (kind of an archive of e-mails) without freezing up the system?
posted by vidur to Computers & Internet (12 answers total)
 
Me, I'd install Cygwin (a Unix-like environment) along with its text-processing utilities; then I'd use the split command to carve the file up into a bunch of 100 MB pieces (assuming that's workable in your file reader). You can control the filenames it spits out if you need to.

Looks like there are some Unix utility ports to Win32 here: http://unxutils.sourceforge.net/ including split and grep.
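
If installing Cygwin or the ports isn't an option, the same idea can be sketched in a few lines of Python. This is only an illustration of line-based splitting, not what split itself does internally; the function name, the ".partNNN" naming and the 100 MB default are my own assumptions. Note that it cuts at line ends, so a CSV record whose quoted field contains a line break could still end up spread over two pieces.

```python
import os

def split_by_lines(src_path, out_dir, chunk_bytes=100 * 1024 * 1024):
    """Split src_path into numbered pieces of roughly chunk_bytes each.

    Pieces are cut only at line ends, so no line is broken in half.
    Returns the list of piece paths, in order.
    (Hypothetical helper; the names and defaults are assumptions.)
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(src_path)
    pieces = []
    out = None
    written = 0
    # Binary mode: no decoding cost, and the byte counts are exact.
    with open(src_path, "rb") as src:
        for line in src:  # reads one line at a time, never the whole file
            if out is None or written >= chunk_bytes:
                if out is not None:
                    out.close()
                piece = os.path.join(out_dir, "%s.part%03d" % (base, len(pieces)))
                pieces.append(piece)
                out = open(piece, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return pieces
```

Each piece can then be opened in whatever editor you already have.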
posted by jenkinsEar at 7:00 PM on September 1, 2011 [1 favorite]


Try TextPad.
posted by mikeand1 at 7:11 PM on September 1, 2011


Best answer: I've used vim for that before.
posted by hackwolf at 7:19 PM on September 1, 2011


Best answer: Seconding that vim will handle it, even on Win7.
posted by Perplexity at 7:22 PM on September 1, 2011


In my experience, Notepad++ and NoteTab both handle large files well; I don't think I've opened anything a gig or bigger, but either one might be more responsive than what you're using now.
posted by AzraelBrown at 7:27 PM on September 1, 2011


Seconding TextPad: it should work (albeit slowly), it'll be easier to work with than multiple files, and it doesn't have the learning curve of vim.
posted by orangejenny at 7:29 PM on September 1, 2011


Response by poster: Notepad++ refused to even open the file. I've just installed vim (have used it on other systems before, so I should have thought of it myself), and it seems to be working haltingly - but that's okay as far as I am concerned. I just need to read some of the stuff, no need to process the text. Thanks!
posted by vidur at 7:50 PM on September 1, 2011


I don't know what software you have at your disposal, but if I needed to do something like this for work I would import the whole file into a database and query against that. I'm guessing your CSV file has dates, email addresses, message body, etc., all as separate fields.
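
For what it's worth, here is a minimal sketch of that import using Python's built-in csv and sqlite3 modules. It assumes the first line of the CSV is a header row with unique column names, and that UTF-8 is a reasonable guess at the encoding; the table name "messages" and the function name are made up for illustration. Rows whose field count doesn't match the header are skipped rather than repaired.

```python
import csv
import sqlite3

def import_csv(csv_path, db_path, table="messages", encoding="utf-8"):
    """Load a CSV with a header row into a SQLite table, row by row.

    Hypothetical helper: the table name and encoding are assumptions.
    """
    # Message bodies can exceed the csv module's default field size limit.
    csv.field_size_limit(2**31 - 1)
    conn = sqlite3.connect(db_path)
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(csv_path, newline="", encoding=encoding, errors="replace") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join('"%s"' % name.replace('"', '""') for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute('CREATE TABLE IF NOT EXISTS "%s" (%s)' % (table, cols))
        # executemany pulls rows from the generator lazily, so the whole
        # file is never held in memory at once. Malformed rows are skipped.
        rows = (row for row in reader if len(row) == len(header))
        conn.executemany('INSERT INTO "%s" VALUES (%s)' % (table, marks), rows)
    conn.commit()
    return conn
```

After that, reading a subset is an ordinary query, e.g. conn.execute('SELECT * FROM messages LIMIT 20'), with WHERE clauses on whatever columns the header actually contains.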
posted by gatsby died at 4:58 AM on September 2, 2011


You want an editor that will memory-map the file instead of trying to read the whole thing into memory. The file will open nearly instantly because it is only paged in as needed, i.e. the only parts read from disk are the parts displayed on the screen. An application used in this mode is usually smart enough to disable features that require reading the whole file, like counting the total number of lines. Because of this you can edit files larger than the amount of physical memory; the only limit is the virtual address space. For 32-bit apps that is 2GB (standard config), 3GB (program linked with the "large address aware" option and the /3GB boot option used), or 4GB (32-bit app linked with the "large address aware" option, running on a 64-bit operating system); for a 64-bit app it is 8TB.

UltraEdit is an example of a Windows editor with this feature.
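
To make the memory-mapping idea concrete, here is a small sketch using Python's standard mmap module. It is not how UltraEdit or any particular editor is implemented; the function names peek and lines_near are my own. A 32-bit Python will probably fail to map a file this large, for the address-space reasons given above, and mapping an empty file raises an error.

```python
import mmap

def peek(path, offset=0, length=4096):
    """Return `length` bytes starting at `offset` without reading the rest."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded only when touched.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]

def lines_near(path, offset, count=20):
    """Return up to `count` whole lines, starting with the line at `offset`."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Back up to the start of the line that contains `offset`.
            pos = mm.rfind(b"\n", 0, offset) + 1
            lines = []
            for _ in range(count):
                end = mm.find(b"\n", pos)
                if end == -1:
                    # Last line has no trailing newline.
                    if pos < len(mm):
                        lines.append(mm[pos:])
                    break
                lines.append(mm[pos:end])
                pos = end + 1
            return lines
```

Calling lines_near(path, 800 * 1024 * 1024) jumps straight into the middle of the file; only the pages around that offset get read.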
posted by Rhomboid at 11:20 AM on September 2, 2011


Seconding gatsby died: I've dealt with 1GB text files by importing them into Access, assuming the file is structured enough to be mapped onto a database structure. Depending on what you want to do with the file, it's probably the easiest way to deal with it later, if you want to do anything more than just edit a few lines.
posted by Boobus Tuber at 12:44 PM on September 2, 2011


Response by poster: Thanks. I've got OpenOffice. Will try importing it into Base and see where that takes me.

vim has already done most of what I wanted to do with it. I don't really need to process the text, and this isn't for work. But if importing to a database makes it easier to sort/read the messages, that'd be great.
posted by vidur at 2:42 PM on September 2, 2011


I use LTF (large text file) Viewer for this. Not an editor, just a viewer / searcher. It will load the file instantly. Scrolling is also instant, but searching is slow.
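
I don't know how LTF Viewer is built, but the slow searching makes sense: a plain search has to read the file from the start until it finds a match. As a rough Python sketch of that kind of search (the function name and the 1 MB chunk size are my own assumptions), this scans the file in fixed-size chunks and reports byte offsets, so memory use stays small even though the time grows with the file size.

```python
def find_all(path, needle, chunk_size=1024 * 1024):
    """Yield the byte offset of every occurrence of `needle` (bytes) in the file."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    with open(path, "rb") as f:
        tail = b""
        base = 0  # file offset of the first byte of tail + chunk
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            pos = buf.find(needle)
            while pos != -1:
                yield base + pos
                pos = buf.find(needle, pos + 1)
            # Keep the last len(needle) - 1 bytes so a match that straddles
            # two chunks is still seen, without reporting any match twice.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)
```

For example, next(find_all(path, b"some sender name"), None) gives the offset of the first hit, or None if there isn't one.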
posted by smackfu at 8:19 AM on September 7, 2011

