Help me make sense of my AmazonS3 usage report!
November 3, 2008 6:36 AM

I use JungleDisk to back up my data to Amazon S3. My current bill is about 5 times higher than the previous one and I'd like to find out why. Unfortunately, the usage report is not presented in a format easily understood by mere mortals. Is there an application or a service (free or very cheap) out there that translates this data into something readable by humans?

I found this service, but after attempting to sign up for the trial and linking to my account I don't think it can do what I want it to. At least I can't figure out how to make it work...
posted by jluce50 to Computers & Internet (15 answers total) 1 user marked this as a favorite
Is the higher bill due to data storage or transfer? Jungledisk by default keeps old files for a certain amount of time. If you create and delete a lot of files this can greatly increase storage use.
posted by Brennus at 7:28 AM on November 3, 2008

Response by poster: Not sure. That's basically what I'm trying to find out...
posted by jluce50 at 7:47 AM on November 3, 2008

If you click on Help -> Account activity report it takes you to the Amazon website where you can see your charges broken down as storage vs. transfer (+PUT and GET requests).
Compare the two bills, and if it is storage that has grown a lot, then check the settings for keeping deleted files in Jungledisk. If it's transfer that has grown, you might want to sign up for Jungledisk Plus to enable incremental transfers.
posted by Brennus at 7:53 AM on November 3, 2008

My current bill is about 5 times higher than the previous one and I'd like to find out why.

The first thing you need to ask in any situation like this is: what changed? I'm also assuming that this increase comes after several months of consistent, lower billing, yes?

What, specifically, are you backing up, and on what OS? You need to be really careful not just with a lot of small files, as others have noted, but also with programs that keep their DB in a single, large file: backup software looks at it, says "yes, it's changed", and transfers it over each time. Some mail programs, for example, do this, so you've got constant bandwidth charges. Since it's still considered the same file, though (and I assume JungleDisk doesn't do versioning), you won't have the "lingering deleted files" problem with this.

Also be careful with making sure you're not backing up things like:

- temp directories (which are often subdirectories of stuff you do want backed up)
- mail attachment download directories
- caches
posted by mkultra at 8:03 AM on November 3, 2008
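
One way to check for the kinds of directories listed above is to walk the backup source and flag folders whose names suggest temp or cache data. This is only a minimal sketch: the name list `SUSPECT_NAMES` and the helper `find_suspect_dirs` are hypothetical, are not part of JungleDisk, and would need adjusting for your OS and applications.

```python
import os

# Hypothetical, illustrative folder names; adjust for your OS and applications.
SUSPECT_NAMES = {"temp", "tmp", "cache", "caches", "attachments"}


def find_suspect_dirs(root):
    """Return sorted paths under root whose folder name looks like temp/cache data."""
    hits = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name.lower() in SUSPECT_NAMES:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_suspect_dirs` on the folder you back up returns candidates to review by hand; a name match alone does not prove a folder is safe to exclude.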

and I assume JungleDisk doesn't do versioning

It does.
posted by Brennus at 8:09 AM on November 3, 2008

Ah, there you go, even worse. Big DB file + lots of changes = big backup trouble. Does JungleDisk keep a log?
posted by mkultra at 8:18 AM on November 3, 2008
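
The "big DB file + lots of changes" arithmetic can be roughed out as below. This is a sketch under stated assumptions, not a description of how JungleDisk actually stores versions: it assumes no incremental transfer, that every kept version is a full copy, and that kept versions sit in storage for the whole month. The function name and the example figures are hypothetical; the default rates are the $0.10 per GB in and $0.15 per GB-month quoted in this thread.

```python
def monthly_cost_of_changing_file(size_gb, uploads_per_month, versions_kept,
                                  transfer_rate=0.10, storage_rate=0.15):
    """Rough monthly cost when a whole file is re-uploaded on every change.

    Assumptions: no incremental transfer, each kept version is a full copy,
    and kept versions are stored for the whole month.
    """
    transfer_gb = size_gb * uploads_per_month
    stored_gb = size_gb * (1 + versions_kept)  # current copy plus old versions
    cost = transfer_gb * transfer_rate + stored_gb * storage_rate
    return {"transfer_gb": transfer_gb, "stored_gb": stored_gb, "cost": cost}


# Hypothetical example: a 1 GB mail store that changes daily, 30 old versions kept.
estimate = monthly_cost_of_changing_file(1.0, 30, 30)
print(estimate)
```

Note that in this model the upload volume grows right along with the stored volume, since every stored version had to be transferred in first.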

Response by poster: Geez, I totally missed the "Expand All" link on the Billing Statement. What a dork!

Anyway, here are the breakdowns for the last three statements. FWIW, it's been steady at around $5 for quite a while. I've added a lot of files over the last couple of cycles, so that explains the increase in "data transfer in". However, even with saving of deleted files enabled (I think I have it set to save for 30 days) how does that explain the HUGE increase in "storage used"? How does adding 4GB increase "storage used" by almost 100GB? What am I missing?

$0.15 per GB-Month of storage used                    167.298 GB-Mo    25.09
$0.10 per GB - all data transfer in                   3.798 GB          0.38
$0.17 per GB - first 10 TB / month data transfer out  0.000092 GB       0.01
$0.01 per 1,000 PUT, POST, or LIST requests           1,631 Requests    0.02
$0.01 per 10,000 GET and all other requests           441 Requests      0.01
Charges due                                                            25.51

$0.15 per GB-Month of storage used                    72.913 GB-Mo     10.94
$0.10 per GB - all data transfer in                   4.911 GB          0.49
$0.17 per GB - first 10 TB / month data transfer out  0.000441 GB       0.01
$0.01 per 1,000 PUT, POST, or LIST requests           10,231 Requests   0.10
$0.01 per 10,000 GET and all other requests           2,088 Requests    0.01
Charges due                                                            11.55

$0.15 per GB-Month of storage used                    44.041 GB-Mo      6.61
$0.10 per GB - all data transfer in                   1.893 GB          0.19
$0.17 per GB - first 10 TB / month data transfer out  0.000097 GB       0.01
$0.01 per 1,000 PUT, POST, or LIST requests           2,776 Requests    0.03
$0.01 per 10,000 GET and all other requests           454 Requests      0.01
SLA Service credit for July (what's this?)            Credit           -1.38
Charges due                                                             5.47
posted by jluce50 at 8:32 AM on November 3, 2008
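
As a sanity check, the most recent statement above can be recomputed from its own rates and usage figures. This is a minimal sketch: the rounding rule (nearest cent, with a one-cent minimum on any nonzero usage) is an assumption inferred from the posted numbers, not a documented Amazon rule, and `line_charge` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def line_charge(rate, per_units, usage):
    """Charge for one line item: usage / per_units * rate, rounded to the cent."""
    raw = Decimal(rate) * Decimal(usage) / Decimal(per_units)
    charge = raw.quantize(CENT, rounding=ROUND_HALF_UP)
    # Assumption inferred from the statements: nonzero usage costs at least one cent.
    if raw > 0 and charge < CENT:
        charge = CENT
    return charge


# (label, rate in dollars, units per rate, usage) copied from the first statement above.
LATEST = [
    ("storage (GB-Mo)", "0.15", "1", "167.298"),
    ("transfer in (GB)", "0.10", "1", "3.798"),
    ("transfer out (GB)", "0.17", "1", "0.000092"),
    ("PUT/POST/LIST", "0.01", "1000", "1631"),
    ("GET and other", "0.01", "10000", "441"),
]

charges = {label: line_charge(rate, per, usage) for label, rate, per, usage in LATEST}
total = sum(charges.values(), Decimal("0"))
storage_share = float(charges["storage (GB-Mo)"] / total)

for label, amount in charges.items():
    print(f"{label:20s} {amount}")
print(f"total {total}, storage share {storage_share:.0%}")
```

Under that rounding assumption the line items add up to the posted "Charges due", and nearly all of the total is the storage line, so the bill itself is internally consistent; the open question is only why "storage used" is so high.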

It depends on what kind of files you are backing up. Email stores, for example, are often just one large file, and every change (new email, deleted email) would cause a new transfer or a new version of the file. If the file is relatively big, this could account for the massive increase in storage. If you look in Jungledisk under Logs -> View Backup History, you can see which files were uploaded in each session. Check there to see which files are being updated and how large they are.

Seeing as you have little data transfer but a large increase in storage, it is most likely the versioning causing the increase. I would turn off versioning for now and perform a Bucket -> Backup cleanup to solve this problem. If you really want versioning, you have to figure out which files are large and change often.
posted by Brennus at 8:49 AM on November 3, 2008
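
Besides the backup history, the local side can be checked for files that are both large and recently changed. The sketch below is an illustration only: `large_recent_files` is a hypothetical helper, the 50 MB and 30 day thresholds are arbitrary, and a recent modification time is just a hint that a file may be getting re-uploaded.

```python
import os
import time


def large_recent_files(root, min_mb=50, days=30, now=None):
    """List (size_mb, path) for files >= min_mb modified within the last `days`."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            if info.st_size >= min_mb * 1024 * 1024 and info.st_mtime >= cutoff:
                results.append((info.st_size / (1024 * 1024), path))
    return sorted(results, reverse=True)  # largest first
```

Running it on the backed-up folder and comparing the output with the files listed in the backup history should show whether a few big files are responsible.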

Your bill is higher because each month you have stored more and more on Amazon's S3 servers. You are charged for upstream and downstream bandwidth as well as for storing the data there each month. You've amassed 167GB of data in the S3 cloud, billed at fifteen cents per GB per month. That's where the bulk of your most recent bill is coming from.

This is the nature of S3: it's cheap and efficient, but perhaps not for backups measured in hundreds of gigabytes. $27/mo for that still isn't bad at all.
posted by cgomez at 8:55 AM on November 3, 2008

Oh, and the SLA credit is referring to a Service Level Agreement that Amazon has with their customers. In July, S3 suffered about a day of widespread downtime, so they're refunding some period of service to make up for that time the service was inaccessible.
posted by cgomez at 9:20 AM on November 3, 2008

There should be a preference in the Jungledisk app to not keep backed up files that you have deleted.

AFAIK if you are using it for backups and delete something locally then Jungledisk by default keeps the deleted files for a period of time. There is an option in prefs: previous versions... mine is set to remove previous versions of changed files after 60 days. At 15c a gig I don't mind for the added peace of mind!

But that looks to me like what your issue is.
posted by twistedonion at 9:42 AM on November 3, 2008

Response by poster: I understand the idea behind file versioning and I understand how their billing works. I guess what I don't understand is how their versioning works exactly. The disconnect is between the 4GB of "data in" bandwidth and the increase in the "storage used" of almost 100GB. Even if previous versions were saved for every single file uploaded (they weren't) that would only account for 10GB of growth (assuming the new versions were roughly the same size). I don't have any Outlook .pst files or anything that would be replacing a previous version with something drastically larger. The vast majority of files are pictures and music. The rest are home movies, documents, etc.
posted by jluce50 at 11:13 AM on November 3, 2008

It looks like your data used shouldn't have jumped from 72 GB to 167 GB with only 4GB of transfer. Have you tried contacting Amazon?
posted by wongcorgi at 11:38 AM on November 3, 2008

You should definitely talk to someone at Amazon and get clarification on those numbers, because unless you have some other way of transferring data in that doesn't show up on your data transfer bill, those numbers don't add up:

Starting point: 44GB
Next month you transfer in 5GB and your storage goes up to 73GB
In the next month you transfer in 4GB and your usage goes up to 167GB

In the first month your storage usage goes up 29GB but they say you only transferred in 5GB. In the next month it goes up 94GB but says you transferred in only 4GB.

Either their storage numbers are wrong, their transfer numbers are wrong, or something is happening on the S3 end to duplicate the data. Is it possible that you're uploading the files compressed and S3 is decompressing the files when they're received?

Is there any way you can examine directly what you have stored? That should help you find out where that extra 100+GB comes from.
posted by missmagenta at 1:34 PM on November 3, 2008
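
The comparison above can be written out as a small calculation using the figures from the three posted statements. One caveat, stated as an assumption about the billing unit: GB-Month is a time-weighted average over the month rather than a snapshot, so the difference between two statements is not exactly the number of gigabytes added. The gap is still far larger than the uploads. `unexplained_growth` is a hypothetical helper.

```python
# (storage in GB-Mo, transfer in GB) per statement, oldest first,
# taken from the figures posted earlier in the thread.
STATEMENTS = [
    (44.041, 1.893),
    (72.913, 4.911),
    (167.298, 3.798),
]


def unexplained_growth(statements):
    """For each consecutive pair, return (storage growth, transfer in, gap)."""
    rows = []
    for (prev_storage, _), (storage, transfer_in) in zip(statements, statements[1:]):
        growth = storage - prev_storage
        rows.append((round(growth, 3), transfer_in, round(growth - transfer_in, 3)))
    return rows


for growth, transfer_in, gap in unexplained_growth(STATEMENTS):
    print(f"storage +{growth} GB-Mo, transfer in {transfer_in} GB, unexplained {gap}")
```

If new data can only arrive through billed uploads, the last column should be near zero or negative; here it is positive in both periods, which is the discrepancy to raise with support.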

It looks like Jungledisk uses both S3 (Amazon's storage service) and EC2 (Amazon's "computing-on-demand" service). Though you're charged for transfer to/from S3, you are _not_ charged for transfer between S3/EC2.

Thus, if the Jungledisk EC2 servers are copying/adding/removing data on EC2 to version it in some way, it's possible you could end up with the discrepancy in data transfer/storage you're seeing.

I'd contact Jungledisk. I'd be extremely surprised if this was a bug on Amazon's side. I'd guess it's a side effect of how Jungledisk does its versioning.
posted by bitterpants at 1:37 PM on November 3, 2008
