Sync two servers?
March 19, 2008 9:38 AM Subscribe
IT Geek Filter: A better way to sync millions of files between very remote servers?
Currently I am syncing two file servers, one on the west coast one on the east coast, consisting of several million files, using Double-Take. The syncing performance basically sucks, as does the operation of Double-Take. Does anyone know of a better solution or better software to accomplish this task?
we do something like this using plain old rsync
yeah, the initial sync takes forever (we let it run over the weekend), but the nightly deltas usually only take a few minutes, and we run them at midnight just to be safe
posted by Oktober at 9:45 AM on March 19, 2008
Response by poster: Thanks, but I'm looking for something a little more "Enterprise" than mailing files, also there isn't anyone at this remote site.
posted by Cosine at 9:47 AM on March 19, 2008
Response by poster: I use rsync for some smaller servers, but for this situation rsync has been tried and couldn't handle it, perhaps because these are Windows servers and rsync support on Windows isn't great.
There is enough rate of change for these files that a nightly delta style of sync does not work, files need to be flying 24/7 to keep up. A lot of the overhead is the actual file comparison. It takes Double-Take as much as two weeks to fully compare all the files and if either server has any kind of glitch the file comparison task has to start over from scratch.
posted by Cosine at 9:52 AM on March 19, 2008
Response by poster: odinsdream: That is fine for the initial sync AND that is how I initially did it but that's not the problem, KEEPING them in sync is the problem.
posted by Cosine at 9:53 AM on March 19, 2008
I'm not sure if it's been mentioned, but rsync =) Also, if you're using Windows machines, deltacopy.
posted by bertrandom at 10:05 AM on March 19, 2008
What is your connection speed? File comparison should not take that long.
posted by mphuie at 10:06 AM on March 19, 2008
Response by poster: Bandwidth isn't the problem, latency is, the task of comparing 10,000,000 files doesn't even max out the connection speed now.
posted by Cosine at 10:09 AM on March 19, 2008
Also, you should be looking into WAAS, it's ideal for these types of things.
posted by iamabot at 10:10 AM on March 19, 2008
Whatever rsync did was worse than two-weeks-plus-errors? How?
rsync is pretty much the gold standard for this sort of thing. You'll probably be best off figuring out on your own (since you don't describe it here) what went wrong; there's probably something trivial in the way.
posted by cmiller at 10:12 AM on March 19, 2008
Response by poster: I'm guessing that everyone here suggesting Rsync on Windows hasn't had the pleasure of having Rsync decide after each DST change that ALL files are out of sync and must be recopied in their entirety.
posted by Cosine at 10:14 AM on March 19, 2008
rsync for sure.. if you really do have that many files and the initial scan takes an hour plus, you can break it up into multiple streams with much better luck:
if you are rsyncing
/file_system
do:
/file_system/branch_a
/file_system/branch_b
...
so on
posted by joshgray at 10:22 AM on March 19, 2008
Here's an article on the rsync Windows DST problem with a couple of fixes:
http://www.samba.org/rsync/daylight-savings.html
posted by pocams at 10:22 AM on March 19, 2008
If you're gunshy about rsync:
Assuming you have immediate or remote access to server A or B, winscp (free) and SecureFX (paid) each have directory synch features, with incremental/diff-only options.
The only gotcha I see is that they both operate with the "local to remote" metaphor, as opposed to "remote to remote", hence the part about immediate access to one of the machines.
posted by bhance at 10:27 AM on March 19, 2008
seconding Unison
posted by qxntpqbbbqxl at 10:30 AM on March 19, 2008
Unison has proved useful for me in the past (rsync shy).
posted by holgate at 10:30 AM on March 19, 2008
I'm guessing that everyone here suggesting Rsync on Windows hasn't had the pleasure of having Rsync decide after each DST change that ALL files are out of sync and must be recopied in their entirety.
By default rsync uses modification time and size to decide if a file has changed. It can use a checksum with the -c option (however computing checksums is (obviously) more processor intensive than looking at file attributes). Or, you could use --size-only to turn off the mtime check. You can also use --modify-window if your timestamps are off (in fact, the documentation says you're supposed to use this on FAT partitions).
posted by paulus andronicus at 12:08 PM on March 19, 2008
Nthing rsync. Are you using -z (--compress), --partial(-dir), and possibly --checksum?
posted by hattifattener at 6:03 PM on March 19, 2008
Do research on WAN Optimization. There are tons of products in that marketspace, some cheaper than others.
posted by JintsFan at 1:35 PM on March 26, 2008
This thread is closed to new comments.
posted by Cat Pie Hurts at 9:40 AM on March 19, 2008