I've got the compression blues
January 2, 2008 4:30 PM Subscribe
What kind of archive program should I be using to ZIP... well, not necessarily ZIP but compress a backup of a hard drive? InfoZIP and WinRAR are failing me.
The stuff I want to zip is 52,000 files totalling 16.9 GB uncompressed.
I tried InfoZip 2.32, but as the archive grows to about 2 GB I start getting random "input file" errors.
I am trying WinRAR to build a .rar file, but it is abysmally slow... it has taken 2 hours to compress just half of this data.
Any other ideas? I'm contemplating GZip, but I don't know if it supports saving path structures (and for some reason people always use tar+gz which is a mess). Robustness is also important; I don't want one small CRC error in something this big to corrupt the whole archive.
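For reference, gzip on its own compresses a single stream and keeps no path information; the tar step is what records the directory structure, which is why the two always travel together. A minimal sketch using Python's standard tarfile module (the source directory name is a placeholder):

```python
import tarfile

# gzip alone has no notion of multiple files or paths; tar records the
# directory structure, and gzip compresses the resulting stream.
# "backup_source" is a placeholder for the tree being archived.
with tarfile.open("backup.tar.gz", "w:gz") as tar:
    tar.add("backup_source")

# Listing the archive confirms the paths were preserved.
with tarfile.open("backup.tar.gz", "r:gz") as tar:
    for member in tar.getmembers()[:10]:
        print(member.name)
```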
Are you putting them all into one ZIP file and is your drive FAT32?
FAT32 doesn't like files >4 gigs. Just a warning so you don't run into problems later.
posted by sharkfu at 4:53 PM on January 2, 2008
Use WinRAR and split it into 100 MB chunks? I've done files up to 25 GB this way with no issue.
posted by mphuie at 4:57 PM on January 2, 2008
What jockc said. Unless you're a writer or some other kind of person who generates lots and lots of text files, most of what you're going to be zipping up will be incompressible binary files. Use 7-Zip without compression (zip + "store" method). Afterwards, if you want, you can try to compress the resulting file to see if you get any meaningful space savings.
posted by rhizome at 5:01 PM on January 2, 2008
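A store-only ZIP like rhizome describes can be scripted with Python's zipfile module; a sketch with placeholder paths. allowZip64 matters for an archive this size, since classic ZIP tops out at 4 GB:

```python
import os
import zipfile

# ZIP_STORED is the "store" method: entries go in uncompressed, which is
# fast and loses nothing on already-compressed binaries.
# allowZip64=True lets the archive grow past the classic 4 GB ZIP limit.
src = "backup_source"  # placeholder for the tree being archived
with zipfile.ZipFile("backup.zip", "w", zipfile.ZIP_STORED, allowZip64=True) as zf:
    for root, _dirs, files in os.walk(src):
        for name in files:
            path = os.path.join(root, name)
            zf.write(path, arcname=os.path.relpath(path, src))
```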
I'd use WinRAR, 10-49 MB parts, and PAR files for something like this -- not sure I'd trust an archive that large not to get corrupted.
The other upside is it'll be easier to span optical media if you want an extra backup option.
posted by fishfucker at 5:27 PM on January 2, 2008
If you don't mind paying, Norton Ghost is fabulous for compressing complete system backups.
posted by Oktober at 5:40 PM on January 2, 2008
I use Acronis True Image Home for all my backups. It's wonderful but expensive. I have successfully used WinRAR to back up about a 70 GB file archive into individual chunks based on folder names, resulting in about 800 RAR files.
I'm with the others: turn off compression. 16.9 GB isn't a lot to store or transport, and with WinRAR you can slice it up into easily manageable sizes.
posted by wfrgms at 5:46 PM on January 2, 2008
Well, tar+gz is the way to do it, at least on the *nix and Mac side of the fence. On Windows, though, I'll echo the 7-Zip recommendations. As has been stated, you can use it without compression, and test compression in various formats to see what sort of space saving you get. 7-Zip will also split the archive into chunks of either arbitrary or common (FD, CD, DVD) sizes.
posted by eafarris at 6:00 PM on January 2, 2008
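Volume splitting is built into both 7-Zip and WinRAR, but the mechanism is simple enough to sketch. This illustrative snippet (function names and the part-naming scheme are hypothetical, not either tool's actual format) slices a file into fixed-size parts and rejoins them:

```python
CHUNK = 100 * 1024 * 1024  # 100 MB parts, as suggested earlier in the thread

def split(path):
    """Write path.000, path.001, ... each at most CHUNK bytes."""
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(CHUNK)
            if not data:
                break
            with open(f"{path}.{index:03d}", "wb") as part:
                part.write(data)
            index += 1
    return index  # number of parts written

def join(path, count):
    """Reassemble count parts back into the original file."""
    with open(path, "wb") as dst:
        for index in range(count):
            with open(f"{path}.{index:03d}", "rb") as part:
                dst.write(part.read())
```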
it is abysmally slow... it has taken 2 hours to compress just half of this data
When you consider that you're compressing 16 gigs of data, 2 hours isn't that long.
Another thing to consider is what type of files you're putting into your archive. If the vast majority of the size is taken up by files like MP3s, JPGs, or DIVX videos, you're wasting your time. Those file formats are already heavily compressed, and you're not going to see much size reduction by zipping them.
Since time is usually more valuable than space, just create a tar archive (or zip them using no compression).
posted by chrisamiller at 6:23 PM on January 2, 2008
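That advice is easy to automate: choose the ZIP method per file, deflating text-like files and storing formats that are already compressed. The extension list below is only an illustration:

```python
import os
import zipfile

# Already-compressed formats gain almost nothing from deflate, so store
# them as-is and spend CPU time only where it pays off.
ALREADY_COMPRESSED = {".mp3", ".jpg", ".jpeg", ".png", ".avi", ".zip", ".gz"}

src = "backup_source"  # placeholder for the tree being archived
with zipfile.ZipFile("backup.zip", "w", allowZip64=True) as zf:
    for root, _dirs, files in os.walk(src):
        for name in files:
            path = os.path.join(root, name)
            ext = os.path.splitext(name)[1].lower()
            method = (zipfile.ZIP_STORED if ext in ALREADY_COMPRESSED
                      else zipfile.ZIP_DEFLATED)
            zf.write(path, os.path.relpath(path, src), compress_type=method)
```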
If you think WinRAR is slow, you should see WinRK (which I won't bother to link to, because it won't compress more than 2 GB worth of source files).
Anyway, nthing 7-zip or WinRAR. And nthing breaking it into volumes and using all the recovery/redundancy options available.
(If your backup set includes a lot of JPEGs (say, at least 4 GB worth), I recommend purchasing StuffIt. Its JPEG compression is significantly better than anything WinRAR, 7-Zip or WinRK can achieve. This is only useful, however, if the backup you're compressing is still a normal set of directories and files. If it's already been turned into a single file, then StuffIt won't help.)
posted by krisjohn at 7:02 PM on January 2, 2008
Response by poster: Thanks everyone... actually what I am backing up is non-application stuff, like documents, graphics (like Illustrator files), programming projects, etc. I keep the data and applications as separate as possible. WinRAR is reporting a 40% compression rate, which is pretty good. Also, I'm using NTFS on all my systems.
Yeah, I guess RAR is the way to go, though I'm debating whether the 8 GB of extra space is worth it. I would consider Ghost or True Image but I am paranoid about relying on a payware app.
posted by chips ahoy at 7:08 PM on January 2, 2008
I use Retrospect for backups, but I don't use compression.
One thing that might be causing your "file input" errors is very long filenames. If you accidentally move or rename a folder in such a way that the full file path becomes longer than a certain length, most applications won't be able to read or write to the file. Most of the errors I get when I'm doing backups have to do with either the long filenames or file access errors.
posted by burnmp3s at 7:48 PM on January 2, 2008
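That theory is easy to test before archiving: the classic Windows MAX_PATH limit is 260 characters, and a quick scan will flag any offenders. The root directory here is a placeholder:

```python
import os

MAX_PATH = 260  # classic Windows path-length limit that trips up many tools

def find_long_paths(root):
    """Yield absolute file paths under root that meet or exceed MAX_PATH."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= MAX_PATH:
                yield full

for path in find_long_paths("backup_source"):  # placeholder root
    print(len(path), path)
```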
Ghost is fantastic if you don't mind the extra space (and it doesn't cost much extra time; it's pretty quick). If you absolutely must compress it all, RAR is definitely the way to go. It tends to get pretty good compression rates, and, from what I can tell, and from what other reviewers have posted (it's too late and I'm too lazy to find the actual sources), RAR is also among the fastest compression engines in the game. Just make sure to split it all into smaller pieces.
posted by General Malaise at 9:30 PM on January 2, 2008
I vote 7-Zip. It's free, uses dual-core CPU power a bit better than WinRAR, and typically compresses better too. Since you're compressing documents as well as graphics, you can save some time by NOT using the maximum compression mode -- just use normal or fast. It won't gain much on your graphics since they're already compressed, but it will still crunch your documents pretty well. And if you don't like the basic 7-Zip interface, try JZip -- a nicer GUI for 7-Zip with a few extra bells and whistles. Freeware too.
posted by tra at 10:05 PM on January 2, 2008
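The speed-versus-ratio tradeoff tra mentions is easy to measure on a sample of your own data. A rough sketch using zlib as a stand-in (7-Zip's LZMA will give different numbers, and the sample buffer here is synthetic):

```python
import time
import zlib

# Synthetic, fairly repetitive sample; substitute a real file's bytes.
data = b"some moderately repetitive document text... " * 20000

for level in (1, 5, 9):  # fast, normal-ish, maximum
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: ratio {len(out) / len(data):.3f}, {elapsed * 1000:.1f} ms")
```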
Another vote for WinRAR + PAR! You'll definitely, definitely, definitely, definitely want PAR files.
The basic concept, in case you don't know already, is that you have your archive broken into chunks (FILENAME.RAR, FILENAME.R00, FILENAME.R01, etc.) and you have separate PAR files (FILENAME.PAR, FILENAME.P00, etc.). You can make as many PAR files as you like, up to the number of total chunks. The beauty of PAR files is that, should ANY chunk get corrupted, you can rebuild it with ANY PAR file. So, if file #10 of 20 gets fucked up and you've got a single PAR file, you can rebuild part 10.
How many PAR files you create depends on how much reliability you want. You can have one, or two, or five... the thing is, you can only use one PAR file to replace one chunk. So, if part #10 of 20 is busted and part #17 of 20 is busted, you can't use the same PAR file to rebuild them both. You'd need two PAR files.
Anyway, RAR+PAR. It's the standard. "What standard!? There's no standard!" I hear you scream. The USENET standard, that's what. And as goes USENET, so goes the world.
posted by Civil_Disobedient at 3:08 AM on January 3, 2008
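PAR proper uses Reed-Solomon codes, but the one-PAR-file-repairs-any-one-chunk behavior described above is the same idea as plain XOR parity. A toy sketch (not PAR-compatible) with made-up, equal-length chunks:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks into a single parity block."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

chunks = [b"part-one", b"part-two", b"part-3!!"]  # toy "volumes", equal length
parity = xor_blocks(chunks)

# Lose any single chunk: XOR of the survivors plus parity restores it.
lost = chunks.pop(1)
rebuilt = xor_blocks(chunks + [parity])
assert rebuilt == lost
```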
I use the freeware version of SyncBack. You can set it to compress your files (using zip compression) and it can be set to zip each individual file, which gets you past the FAT32 file-size limitation (not that you have that issue, but others might). This Lifehacker article turned me onto it.
posted by wheat at 6:49 AM on January 3, 2008
If you are considering Civil_Disobedient's suggestion of using parity archives as a safeguard for your data, I'll throw my hat in with a recommendation to use ICE ECC (freeware). Same basic concept, but faster, supports dual-core, has more features (such as subdirectory support), and is updated more regularly than QuickPAR (which hasn't had an update in years). I use this program with my backups for exactly the reason quoted by C_D, but I suspect you may not want to do this if you've got a slower machine, since generating these files takes quite a while, especially for something on the order of gigabytes of data.
posted by tra at 7:13 AM on January 3, 2008
Comodo Backup is a free backup utility that will make ZIPs.
posted by kindall at 4:25 PM on January 3, 2008
actually what I am backing up is non-application stuff, like documents, graphics (like Illustrator files), programming projects, etc.
If you've got 16 gigabytes of this stuff you should be archiving incrementally so you don't have to keep backing up everything you have. Maybe by year, project, etc., but it sounds like you probably have some stuff you don't access very often. Put that stuff in the closet and be done with it.
posted by rhizome at 10:50 AM on January 4, 2008
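One simple way to implement that: compare modification times against the last backup run. A hypothetical sketch, with the root directory and timestamp as placeholders:

```python
import os
import time

def changed_since(root, last_run):
    """Yield files under root modified after the last_run timestamp."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_run:
                yield path

# Placeholder: pretend the previous backup ran a week ago.
last_run = time.time() - 7 * 24 * 3600
to_backup = list(changed_since("backup_source", last_run))
print(f"{len(to_backup)} files changed since the last backup")
```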
This thread is closed to new comments.