Faster way to upload to Amazon S3?
May 20, 2009 9:10 AM

Is there another way to upload ~15GB of data to Amazon S3 besides my slow DSL line?

I signed up for an Amazon S3 account for off-site backups and am using Panic's Transmit to upload ~15GB of data. My AT&T Pro DSL connection tops out at about 50KB/s during uploads (according to Transmit), so this is going to take forever. Do I have any other options?

I'm running Mac OS X 10.5.7 and am also backing up to an external drive via Time Machine.
posted by DakotaPaul to Computers & Internet (25 answers total) 3 users marked this as a favorite
 
Are you a student, or do you have any affiliation with a local university? Most colleges have fat pipes up and down, which would make this process way faster. (I get about 1MB/s, which would reduce the time to a few hours.) Otherwise, I'd say just throttle your upload so it uses, say, 30KB/s, and let it run in the background for a week.
posted by chrisamiller at 9:20 AM on May 20, 2009


It's 88 hours, which is not too bad if you only have to do it once. The key is to come up with a backup system that only uploads the changed files, once you get the initial image on there.
posted by smackfu at 9:26 AM on May 20, 2009
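
For reference, that 88-hour estimate follows directly from the numbers in the question; a quick back-of-the-envelope check in Python (the 15GB and 50KB/s figures are just the ones quoted above):

    # ~15GB of data at a sustained ~50KB/s upload rate
    total_bytes = 15 * 1024**3
    rate_bytes_per_sec = 50 * 1024
    hours = total_bytes / float(rate_bytes_per_sec) / 3600
    print(round(hours, 1))   # about 87-88 hours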


Response by poster: Unfortunately, I'm not a student; access to a fat pipe would be nice.

That 88 hours is going to be spread over a long time, unfortunately. One thing I forgot to mention is that when I check Transmit in the morning, I nearly always see a message like "Unable to write (filename)." At some point during the night, Transmit, my DSL connection, or S3 hiccups, the transfer stops, and I have to start it up again. Sometimes hundreds of files (I'm doing photos now) will upload before it dies, sometimes only 100. It's a pain in the butt. And I am using the Synchronize option in Transmit to only upload new or modified files.

I start uploads before I go to bed and in the morning before work. If it's uploading in the evening when I may be online, surfing is pretty damn slow. And why is that? Why does uploading data make surfing—which is primarily downloading data, isn't it?—slower?
posted by DakotaPaul at 9:47 AM on May 20, 2009


When I check Transmit in the morning, I nearly always see a message like "Unable to write (filename)." At some point during the night, Transmit, my DSL connection, or S3 hiccups and the transfer stops.

Yup, that's Transmit. It's pretty, but sometimes it is so delicate it's infuriating. I ran into that problem often on uploads, for years, from many computers and many connections through many versions of the software.

Then I switched back to Interarchy and all such problems stopped. The new version is very nice.
posted by rokusan at 9:57 AM on May 20, 2009


Are you compressing the files before you attempt to upload?
posted by LuckySeven~ at 9:58 AM on May 20, 2009


Are you using WiFi? If so, plug yourself in to the router overnight. It could improve your connection's reliability.
posted by qxntpqbbbqxl at 9:58 AM on May 20, 2009


Surfing is primarily downloading, but there is still a certain amount of upstream bandwidth needed - you need to send a request for each page, image, script, etc. on every site. If Transmit is hogging all of your upload capacity then your requests have to compete with that, and it will slow your surfing down. I'm not familiar with Transmit, but maybe it has a setting to only use 80% of your connection, like some BitTorrent software does? That would be enough to get your web browsing back up to speed. This is the "throttling" that chrisamiller suggests. If Transmit won't do it, try Cyberduck or YummyFTP, which certainly do.

Your best option might be to find someone you trust who has access to a fast connection from college or work, get them to upload the 15GB, and then change your password.
posted by nowonmai at 10:04 AM on May 20, 2009
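
The 80%-style cap nowonmai describes is just pacing: the client deliberately sends bytes slower than the line can handle so page requests still get through. A minimal sketch of that pacing logic, not tied to Transmit or any particular client (the cap and chunk size here are arbitrary examples):

    import time

    def rate_limited_chunks(path, cap_bytes_per_sec=40 * 1024, chunk_size=16 * 1024):
        # Yield pieces of the file no faster than the cap, leaving upload
        # headroom for ordinary web browsing.
        with open(path, 'rb') as f:
            start = time.time()
            sent = 0
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                sent += len(chunk)
                # If we're ahead of schedule, sleep until we're back under the cap.
                min_elapsed = sent / float(cap_bytes_per_sec)
                elapsed = time.time() - start
                if min_elapsed > elapsed:
                    time.sleep(min_elapsed - elapsed)
                yield chunk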


Response by poster: Are you compressing the files before you attempt to upload?

Most of what I'm uploading are JPGs. Are there any apps that can compress them further?

I checked Transmit's settings, but couldn't find any bandwidth-limiting options like my BitTorrent clients have.

Thanks for the other FTP app suggestions. Will try those.
posted by DakotaPaul at 10:16 AM on May 20, 2009


I checked; Transmit doesn't throttle bandwidth. You could try one of the other programs I mentioned: Cyberduck is free.

You can't compress your JPGs any further: you could make smaller, lower-quality versions, but I'm sure you don't want to.
posted by nowonmai at 10:36 AM on May 20, 2009
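
If you want to convince yourself that the JPEGs really are already compressed, a quick check with Python's zlib makes the point (the filename is a placeholder):

    import zlib

    data = open('photo.jpg', 'rb').read()
    smaller = zlib.compress(data, 9)
    # JPEG data is already compressed, so this typically saves only a percent or two.
    print('%d -> %d bytes' % (len(data), len(smaller)))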


I started out uploading 100GB with Transmit and other clients on my slowish DSL connection. All of the clients had issues and I wasn't getting reliable uploads.

I ended up biting the bullet and purchasing Jungle Disk, which seems considerably more robust than the other S3/FTP clients. It reliably uploaded the 100GB (all JPEGs and raw image files) and now I use it to auto-sync my new photos too. I can't say enough good things about Jungle Disk!

Note: If you want to use Jungle Disk alongside other clients like Transmit, make sure you use compatibility mode, where the files are not encrypted.
posted by avex at 10:39 AM on May 20, 2009


Another vote for JungleDisk; it handles all the management/retry/scheduling/"only upload changes" stuff for you. Really quite an awesome app, and only a one-time $20 (or add $1/mo for extra features).

It took me 4 days for my initial upload, but it was all in the background and brainless thanks to JungleDisk.

Otherwise do as others suggest and copy it all to an external drive, then go to school/work & upload from there where the bandwidth is higher.
posted by jpeacock at 10:53 AM on May 20, 2009


Most of what I'm uploading are JPGs. Are there any apps that can compress them further?

What I meant was you should have the files in several different folders (categorize the folders whichever way you like), then compress (archive) the folders before uploading. Uploading single images or docs or PDFs, etc. is never ideal when there are that many files. Not only is it time-consuming to upload, but the files can get corrupted when they're not zipped, so your backups will be useless when you need them.
posted by LuckySeven~ at 10:55 AM on May 20, 2009
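
The practical gain from archiving here is mainly turning thousands of tiny uploads into a handful of big ones, since the JPEGs themselves won't shrink. A rough sketch of bundling each folder into a plain tar archive before uploading (the source path is a placeholder, and this assumes the folders are already grouped the way you want them archived):

    import os
    import tarfile

    src = 'Photos/2009'                      # placeholder local path
    for name in sorted(os.listdir(src)):
        folder = os.path.join(src, name)
        if not os.path.isdir(folder):
            continue
        # Plain 'w' (no gzip): JPEGs don't compress further, so skip the CPU cost.
        with tarfile.open(name + '.tar', 'w') as tar:
            tar.add(folder, arcname=name)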


Response by poster: Uploading single images or docs or PDFs, etc. is never ideal when there are that many files.

I take photos every day, and they're stored on my HD in a folder for each day, then a folder for each year. I could put them in an archive file for each month, but if those folders are several hundred megs or a gig each, then when my transfer dies in the middle of the night—as it often does—that upload is completely shot and I'd have to do it again, wouldn't I?

Thanks for the Jungle Disk suggestions. Will check into that as well.
posted by DakotaPaul at 11:14 AM on May 20, 2009


Response by poster: then go to school/work & upload from there where the bandwidth is higher.

I would absolutely do that if I was confident our network guys wouldn't find out and I wouldn't have to look for a new job. :-)
posted by DakotaPaul at 11:18 AM on May 20, 2009


> I would absolutely do that if I was confident our network guys wouldn't find out and I wouldn't have to look for a new job.

Ask for permission?
posted by nowonmai at 11:58 AM on May 20, 2009


I take photos every day, and they're stored on my HD in a folder for each day, then a folder for each year. I could put them in an archive file for each month, but if those folders are several hundred megs or a gig each, then when my transfer dies in the middle of the night—as it often does—that upload is completely shot and I'd have to do it again, wouldn't I?

Not necessarily. I don't use Transmit, so I can't speak to the vagaries of that program, but the entire upload should not be a waste. If your 15GB worth of files are archived into 15 different folders of 1 GB each, for example, and your program craps out after uploading 11 of 15, you should only have to re-up the last 4.

Incidentally, many applications have a file size limit; Jungle Disk's, I believe, is 5GB, so you're probably going to have to split that 15GB file anyway. Good luck!
posted by LuckySeven~ at 12:31 PM on May 20, 2009
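
Checking which archives already made it up before restarting is easy to script; a minimal sketch using the boto library (the credentials, bucket name, and local directory are placeholders, and this assumes the archives are stored under their plain filenames):

    import os
    import boto

    conn = boto.connect_s3('ACCESS_KEY', 'SECRET_KEY')   # placeholder credentials
    bucket = conn.get_bucket('my-backup-bucket')          # placeholder bucket name

    for name in sorted(os.listdir('archives')):
        if bucket.get_key(name) is not None:
            continue                                      # already uploaded on an earlier run
        key = bucket.new_key(name)
        key.set_contents_from_filename(os.path.join('archives', name))
        print('uploaded ' + name)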


Correction: S3 imposes the limit, not JungleDisk.
posted by LuckySeven~ at 12:37 PM on May 20, 2009


Response by poster: If your 15GB worth of files are archived into 15 different folders of 1 GB each, for example, and your program craps out after uploading 11 of 15, you should only have to re-up the last 4.

Thanks for responding again, LuckySeven~, but what I meant was: If I have 15 one-gigabyte files and the connection/Transmit craps out in the middle of transferring a file, I'll have to do it again. I've rarely been able to upload a gig of data overnight without a problem.
posted by DakotaPaul at 12:49 PM on May 20, 2009
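
One way around the mid-file failure problem is to cut each big archive into small, fixed-size pieces, so a dropped connection only costs the piece that was in flight. A rough sketch (the archive name and the 100MB piece size are arbitrary examples); the pieces can be rejoined later by concatenating them in order:

    # Split 'january.tar' into january.tar.000, january.tar.001, ...
    piece_size = 100 * 1024 * 1024
    with open('january.tar', 'rb') as src:
        index = 0
        while True:
            piece = src.read(piece_size)
            if not piece:
                break
            with open('january.tar.%03d' % index, 'wb') as out:
                out.write(piece)
            index += 1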


Seconding nowonmai's suggestion to ask for permission to do it at work. If you ran the upload only at night and offered to cap bandwidth at some reasonable figure, you'd be highly unlikely to inconvenience anyone at all. You could probably finish the job in a night, two at the most.
posted by zachlipton at 2:24 PM on May 20, 2009


Best answer: Thanks for responding again, LuckySeven~, but what I meant was: If I have 15 one-gigabyte files and the connection/Transmit craps out in the middle of transferring a file, I'll have to do it again. I've rarely been able to upload a gig of data overnight without a problem.

Yikes; that must be incredibly frustrating. I did some quick research on Transmit and noticed that v. 3.6.7 supposedly fixed some issues the app had with uploading to S3. If you're using 3.6.7 and still having it crap out, have you considered contacting the developer? I mean, you paid for the program; you deserve support.

Otherwise, you could try JungleDisk, as mentioned above, or even a free app like FileZilla. FileZilla is hideously ugly, but at least when an error occurs, it will just pass over the file and keep uploading the rest. Fetch and, as nowonmai suggested, Cyberduck are also very highly regarded. S3Hub and the Firefox extension S3Fox are a couple of other tools you may want to check out.
posted by LuckySeven~ at 3:51 PM on May 20, 2009


Funny, I've plugged this program before, but I have to say Super Flexible File Synchronizer is absolutely great at this sort of thing:

Here's the page on Amazon S3

You can set it to obey a speed cap, it's very robust about hiccups, it creates backups/versions of updated files, and on and on. We beat it to death with hundreds of GBs of backups regularly and it keeps on chugging, on our LAN, via SFTP to remote servers, and to S3.

Free, unlimited 30-day trial to see if it's right for you. I'm backing up a 40GB archive right now with some 36,000 files and it works like a champ. For big transfers, I cache the destination file list so there isn't a huge delay in comparing directories for a sync.

I'm not affiliated with the company or product in any way; it's just been a lifesaver during some projects. We set it to run synchronizations throughout the day, and if a file gets overwritten or corrupted locally, there are multiple backup versions to roll back to.

One last suggestion for backup speed: many small files slow things down awfully. You might want to zip your files together (or into multiple logical groups of files) for the upload. On a slow DSL line I can only imagine the pain.
posted by FrotzOzmoo at 11:19 PM on May 20, 2009
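
The "only upload what changed" behaviour that smackfu and FrotzOzmoo describe boils down to comparing a local file listing against what the bucket already holds. A bare-bones sketch of that comparison with boto (credentials, bucket name, and local root are placeholders; it treats a same-named key with the same size as already backed up, which is a crude but cheap check):

    import os
    import boto

    conn = boto.connect_s3('ACCESS_KEY', 'SECRET_KEY')    # placeholder credentials
    bucket = conn.get_bucket('my-backup-bucket')           # placeholder bucket name
    remote_sizes = dict((k.name, k.size) for k in bucket.list())

    local_root = 'Photos'                                   # placeholder local root
    for dirpath, dirnames, filenames in os.walk(local_root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            keyname = os.path.relpath(path, local_root).replace(os.sep, '/')
            if remote_sizes.get(keyname) == os.path.getsize(path):
                continue                                    # unchanged since last run; skip
            key = bucket.new_key(keyname)
            key.set_contents_from_filename(path)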


Just got an email advertising http://aws.amazon.com/importexport/

Might be worth a look?
posted by rus at 1:19 AM on May 21, 2009


Rus: heh, I was just about to post the very same thing. It's clearly pretty darn expensive for this purpose ($80 per-drive charge), but at least you now have the option of throwing money at the problem and having it solved pretty painlessly. I'd do more than 15GB while you're at it to make it more worthwhile.
posted by zachlipton at 1:30 AM on May 21, 2009


Response by poster: Wow, thanks for all the suggestions, everyone! I downloaded Cyberduck last night and began an upload before bed, and it was still chugging away this morning.

I did some quick research on Transmit and noticed that v. 3.6.7 supposedly fixed some issues the app had with uploading to S3. If you're using 3.6.7 and still having it crap out, have you considered contacting the developer? I mean, you paid for the program; you deserve support.

I will definitely check my version of Transmit after work today, and will contact them if I'm still seeing problems. And thank you for the other suggestions, LuckySeven~. You've all been very helpful!
posted by DakotaPaul at 8:28 AM on May 21, 2009


Update: Amazon has just started a new service called AWS Import/Export that's designed to move large amounts of data to S3 storage without the user uploading it. In other words, you ship them your drive and they'll do the work for you. Here's an article about it at Ars Technica, and the direct link from Amazon is here.
posted by LuckySeven~ at 9:51 AM on May 27, 2009


This thread is closed to new comments.