CSV editor capable of editing giant files
December 18, 2008 7:43 AM

I need a freeware app that can efficiently view and edit massive CSV files. I have a couple of 6K column csv files I need to look at and edit.

I've tried CSVed but it just goes to 100% CPU and stops repainting the screen. It seems its partial loading works only on the rows being viewed, not the columns, so it just can't load my files.

I can't make sense of the files in a straight text editor because the columns are not aligned.
posted by srboisvert to Computers & Internet (12 answers total) 3 users marked this as a favorite
 
Response by poster: I forgot - Windows please.
posted by srboisvert at 7:46 AM on December 18, 2008


Have you tried a freeware spreadsheet editor? I would hope 6K columns is not a problem for Symphony. You can download it here.
posted by ubiquity at 7:55 AM on December 18, 2008


I've used OpenOffice to view very large CSV files (800 MB) without problems.
posted by Cat Pie Hurts at 8:00 AM on December 18, 2008


Yeah, OpenOffice is your best bet. CSV files are generally pretty lightweight to open because they don't have any calculations in them; they're usually database dumps of some kind.
posted by Happy Dave at 8:23 AM on December 18, 2008


Yes, spreadsheets.

You can upload CSV files to Google Documents*, which doesn't require an install.

* I'm sure there are other online spreadsheets too. Just tossing GD out as an example.
posted by unixrat at 8:28 AM on December 18, 2008


ooops, i apologize for my over-optimism. Symphony has the same limitation as Excel -- it won't read in more than 256 columns. :-(
posted by ubiquity at 8:50 AM on December 18, 2008


My job sometimes requires similar manipulations and I almost always resort to Perl. Definitely the fastest if you know what you are doing. Perl can easily parse CSV files and manipulate with almost limitless power. (Feel free to post back here/email me if I can help with it)

I'm fairly sure that OpenOffice has a 1024 column limit, not sure about gDocs.

Depending on what you need to do with the files, I think some text editors have a feature to line up by columns. I was unable to find the feature in Notepad++ (my fav text/code editor) but I'm pretty sure it's there. Other editors may also have this feature. Check out PSPad in particular.
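
If no editor turns out to have the feature, lining up columns is not hard to script. Below is a minimal sketch in Python (not Perl), assuming the file parses as ordinary comma-separated text; the function name `align_columns` and the `max_width` cap are made up for illustration. It pads every cell to the width of its column so the result can be read in any text editor with word wrap turned off.

```python
import csv
import io


def align_columns(csv_text, max_width=20):
    """Return the CSV as plain text with each column padded to a common width."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    n_cols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    rows = [row + [""] * (n_cols - len(row)) for row in rows]
    # Column width is the longest cell in that column, capped at max_width.
    widths = [
        min(max_width, max(len(row[i]) for row in rows)) for i in range(n_cols)
    ]
    lines = []
    for row in rows:
        # Cells longer than the cap are truncated for display only.
        cells = [cell[:w].ljust(w) for cell, w in zip(row, widths)]
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)


sample = "id,name,score\n1,alice,9.5\n22,bob,10\n"
print(align_columns(sample))
```

The output is for eyeballing only: long cells are truncated, so edits should still be made in the original file.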

(note: I'm sorry I don't have time to provide more info or links. Post back here if you have trouble finding anything)
posted by coolin86 at 9:02 AM on December 18, 2008 [1 favorite]


R reads in CSV, is free, and the 32-bit version can address up to 4 GB of space and assign it to an object with fewer than 2*10^9 elements.

You can use either read.table or scan to read in the data — scan will work better for large files, in general, but you need to do more work to specify the column names, so read.table is probably better for your needs.

Type ?read.table or ?scan to read the help descriptions and examples for these commands. For CSV, your separator is a comma (",") and your quote character is quotation mark, which needs to be escaped ("\"").

If you need to edit the file and don't like entering text commands, try using a GUI wrapper like JGR.
posted by Blazecock Pileon at 9:54 AM on December 18, 2008


seconding the Perl/programming route. Is there any reason you need to see the data all at once like that?

You might also be able to figure out how to break the file apart into groups of 256 columns, and just load it as 24 or so sheets in Excel.
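
A rough sketch of that split, in Python rather than Perl and assuming the file parses cleanly with a standard CSV reader. The function name `split_columns` is hypothetical, and the sketch works on strings so it is easy to try out; for real files you would read from and write to disk instead.

```python
import csv
import io


def split_columns(csv_text, chunk_size=256):
    """Split a wide CSV into several CSV strings of at most chunk_size columns."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    n_cols = max((len(row) for row in rows), default=0)
    chunks = []
    for start in range(0, n_cols, chunk_size):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        for row in rows:
            # Every chunk keeps all rows, but only one slice of the columns.
            writer.writerow(row[start:start + chunk_size])
        chunks.append(out.getvalue())
    return chunks


sample = "a,b,c,d,e\n1,2,3,4,5\n"
for i, chunk in enumerate(split_columns(sample, chunk_size=2), start=1):
    print(f"--- sheet {i} ---")
    print(chunk, end="")
```

With 6,000 columns and the default chunk size this yields 24 pieces, the last one holding the remaining 112 columns.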
posted by NormandyJack at 9:57 AM on December 18, 2008


Excel 2007 can cope with 16,384 columns by 1,048,576 rows.
posted by JonB at 11:16 AM on December 18, 2008


Response by poster: "My job sometimes requires similar manipulations and I almost always resort to Perl. Definitely the fastest if you know what you are doing. Perl can easily parse CSV files and manipulate with almost limitless power. (Feel free to post back here/email me if I can help with it)"

Part of the problem is that there are some malformed cells (commas in data) and I need to be able to eyeball them so I can fix them and then use the file in code.

I've actually resorted to transposing the array with Ruby's Array#transpose so I can load it in Excel, edit it to fix the bad cells, and then transpose it again to restore it.
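
For anyone who wants to try the same trick, here is a minimal sketch of the idea in Python rather than Ruby (so not the exact code I ran). The name `transpose_csv` is made up, and padding ragged rows with empty cells is a choice of this sketch, not something the Ruby version did.

```python
import csv
import io
from itertools import zip_longest


def transpose_csv(csv_text):
    """Swap rows and columns so a few very wide rows become many narrow ones."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # zip_longest pads short rows with "" so ragged input still transposes.
    transposed = zip_longest(*rows, fillvalue="")
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(transposed)
    return out.getvalue()


sample = "a,b,c\n1,2,3\n"
flipped = transpose_csv(sample)
print(flipped, end="")
# Transposing a rectangular file twice gives back the original.
print(transpose_csv(flipped) == sample)
```

Because of the padding, a file with rows of uneven length will not round-trip exactly: the short rows come back with trailing empty cells.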
posted by srboisvert at 12:27 PM on December 18, 2008


Interesting plan. I didn't think of doing it that way. In the cases where I have commas in the data I dump lines with more than the expected number of columns to a separate file and then process that file more carefully. But I've never had to deal with 6000 columns. :)
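
Roughly what I mean, sketched in Python instead of Perl. The function name `split_by_field_count` is just for illustration, and it assumes the first row (the header) has the correct number of columns unless you pass `expected` yourself.

```python
import csv
import io


def split_by_field_count(csv_text, expected=None):
    """Separate rows with too many fields from rows that look fine."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [], []
    if expected is None:
        # Assumption: the header row has the right number of columns.
        expected = len(rows[0])
    good, suspect = [], []
    for record_no, row in enumerate(rows, start=1):
        if len(row) > expected:
            # An unquoted comma inside a cell shows up as an extra field.
            suspect.append((record_no, row))
        else:
            good.append(row)
    return good, suspect


sample = "id,name,city\n1,alice,Leeds\n2,bob,York, UK\n3,carol,Bath\n"
good, suspect = split_by_field_count(sample)
for record_no, row in suspect:
    print(f"record {record_no}: {len(row)} fields: {row}")
```

The suspect rows can then be written to their own file and fixed by hand, which is a much smaller job than eyeballing the whole thing.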

Good luck.
posted by coolin86 at 8:41 AM on December 19, 2008

