How do I measure daily adds/changes to a set of data, for backup purposes?
June 10, 2008 12:41 PM

So we just added another office; each has been doing its own backups and taking them offsite. We'd like to start backing up each office to the other office (over a VPN) as the offsite backup. I back you up, you back me up. They are in different regions, so that seems better than tapes floating around.

But I don't know how much we'll be able to deliver overnight over our pipes - the upload speed on both DSL lines is only 768kbps. I know it will take a long time to do the initial full backup; but how do I measure how big the daily incrementals will be?

I'm thinking of some way to measure the size of the data on a Monday, then track how many MB/GB is changed/added each day, track that over the course of a week or two, and use that to figure out if we can do this with our existing lines or if we need to invest in fatter pipes.
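For a rough sense of the ceiling on a 768kbps upload, the arithmetic can be sketched in Python. The 8 and 12 hour backup windows and the 80% efficiency factor are assumptions for illustration, not measurements from these lines; real DSL and VPN overhead will vary.

```python
def max_transfer_bytes(link_kbps, window_hours, efficiency=0.8):
    """Rough upper bound on bytes that fit through the link in one window.

    efficiency is an assumed fudge factor for protocol/VPN overhead,
    not a measured value.
    """
    bits_per_second = link_kbps * 1000      # 1 kilobit = 1000 bits
    bytes_per_second = bits_per_second / 8  # 8 bits per byte
    return bytes_per_second * window_hours * 3600 * efficiency


if __name__ == "__main__":
    # Hypothetical overnight windows; adjust to the real schedule.
    for hours in (8, 12):
        budget = max_transfer_bytes(768, hours)
        print(f"{hours} h window: about {budget / 1e9:.2f} GB")
```

With the assumed 80% factor, that works out to roughly 2.2 GB in 8 hours or 3.3 GB in 12 hours, so the measured daily change would need to come in under that.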

Anyone know of a way to do this without being a sysadmin/engineer etc? I can figure out most things, but I'm not going to be able to write my own scripts or whatever.
Suggestions?
posted by penciltopper to Computers & Internet (6 answers total)
 
It's not necessarily that easy.

In the case of plaintext, you could just diff the old file against the new one to see what changed. But if your files are binaries, you will, to vastly oversimplify, basically have to copy the entire changed file every time instead of copying a diff, even if you only made very minor changes to it.
posted by Tomorrowful at 12:50 PM on June 10, 2008


You can do a trial locally with rsync

Assuming it's all in one place, make a copy of the directory, wait a day while people modify files as usual, then do an rsync between the two locations with the --dry-run flag (plus -v to list the files and --stats for the totals). This will give you a listing of files that would be transferred, along with the total size of the transfer.

Lots more details on rsync can be found with some google searches. There are many utilities that allow you to use rsync through a GUI-type interface. A lot of the details depend on the size of what you're backing up, and the availability of storage you have.
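If rsync isn't handy on a Windows workstation, the same idea can be approximated with a short Python script. This is only a sketch, not rsync itself: it compares yesterday's copy against the live directory by file size and modification time (similar in spirit to rsync's default quick check) and counts every changed file at its full size, so it is a worst-case estimate. It assumes the copy preserved modification times; the function names are made up for illustration.

```python
import os


def scan(root):
    """Map each file's path (relative to root) to (size, mtime in whole seconds)."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            files[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return files


def estimate_transfer(old_root, new_root):
    """Return (added_bytes, changed_bytes) for new_root compared to old_root.

    Changed files are counted at full size (worst case); deletions are ignored
    because they cost almost nothing to transfer.
    """
    old, new = scan(old_root), scan(new_root)
    added = changed = 0
    for rel, (size, mtime) in new.items():
        if rel not in old:
            added += size
        elif old[rel] != (size, mtime):
            changed += size
    return added, changed
```

Running `estimate_transfer` on the day-old copy and the live share once a day for a week or two would give a series of daily figures to compare against what the line can carry.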
posted by chrisamiller at 12:51 PM on June 10, 2008


If you use rsync for backups, you could use the --stats argument to log statistics on a daily basis, to help answer these questions.
posted by Blazecock Pileon at 12:53 PM on June 10, 2008


Response by poster: Oh, I should add that what's being backed up is the contents of several fileshares on two Windows Server 2003 machines, backed up to external drives. I don't know enough to feel comfortable messing with the servers themselves; I need something I can run from a spare workstation - something that's not the server itself?
posted by penciltopper at 1:40 PM on June 10, 2008


If you don't want to script, you could easily right-click each fileshare and choose Properties to find its size once or several times a day, and just log it in Excel.

Or if you have basic command-line skills, you could save the output of "dir /s" to a file; the end of that output lists the total file size.
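For anyone who would rather not copy numbers by hand, the same daily size log can be sketched in Python. The function names and the CSV layout are assumptions for illustration. Note that a total like this only shows net growth: a file edited in place without changing size would not show up.

```python
import csv
import datetime
import os


def tree_size(root):
    """Total size in bytes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # unreadable or vanished file; skip it
    return total


def log_size(root, log_path):
    """Append today's date, the root path, and its total size to a CSV file."""
    size = tree_size(root)
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow([datetime.date.today().isoformat(), root, size])
    return size
```

A call such as `log_size(r"\\server\share", "share_sizes.csv")` (a hypothetical share path) could be scheduled once a day from a workstation, and the resulting CSV opens directly in Excel.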
posted by wongcorgi at 2:39 PM on June 10, 2008


Response by poster: @ wongcorgi: Just logging the size of the share won't get me the Changes made to the data, which I'm sure are more bytes per day than the Additions. I'll need both to figure out

I'm not against scripts, etc; I just don't know how to write my own - and can't learn in the time frame given to me to come up with an answer. The real answer is that we need an IT person in-house or more readily available, but that's going to take even longer. Right now, I just want to stop having to manage all these tapes. :)
posted by penciltopper at 2:49 PM on June 10, 2008

