We've used tar to back up and compress (gzip) selected directories on our file server with very good results until recently.
Every one of our backups is stored on mirrored (RAID) hard drives and simultaneously uploaded to an Amazon S3 bucket for off-site storage.
As our data has grown rapidly, so have our backups. This week, our backup uploads have run around the clock just to sync the fresh backups from the last 7 days, and they still haven't finished. A faster connection would ease some of this (but isn't an option at the moment), and I'd rather build a real solution than settle for a workaround.
What alternative strategy could we use to back up our directories that keeps us away from multi-gigabyte archive files, still lets us use tar, and reduces the bandwidth needed to sync the files?
Here's a commercial recommendation. Cactus Lone-Tar is a full backup suite that generates archive files that can be listed and extracted with plain tar, even when written to tape. That's handy because you don't need the software itself to restore an archive. It's my go-to solution for standalone Linux server backups. Lone-Tar now has an online component that can integrate with a bundled offsite storage package or a remote Linux server. Because it's a proper backup suite, it maintains a catalog and can accommodate FULL, INCREMENTAL and SELECTIVE backups.
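Since the archives are plain tar format, a restore needs nothing beyond stock tar. A minimal sketch, assuming the tape sits at the common /dev/st0 device path (adjust for your hardware):

    # List the contents of the archive on tape using stock tar
    tar -tvf /dev/st0

    # Restore a single file from the tape archive
    tar -xvf /dev/st0 path/to/file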
There are a lot of unknown variables here: how large your backups are, what your bandwidth limits are, whether you want incremental or full backups, and so on.
A few suggestions regardless:
Use rsync over SSH with compression enabled (rsync's -z flag, or ssh -C); see the sketch after these suggestions. Rsync greatly reduces the amount of data transferred on each backup because it only sends the differences, and the compression cuts the bandwidth required further.
If bandwidth is limited, consider backing up to local disks. If you want off-site backups, you can always mail the disks off-site. As storage capacity keeps exploding while bandwidth hasn't increased to match, you really shouldn't dismiss this as a valid option.
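A minimal rsync-over-SSH invocation for the first suggestion, assuming hypothetical source and destination paths and a reachable backup host:

    # -a preserves permissions, times, and symlinks; -z compresses in transit;
    # --delete mirrors removals on the destination so it stays an exact copy.
    rsync -az --delete -e ssh /srv/data/ backup@backuphost:/backups/data/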
[edit] I noticed the incremental tag. Does Amazon S3 provide support for snapshots? That would take care of the incremental aspect.
Use rsync over SSH. If you want to keep historic versions, you can set rsync's -b (backup) and related options. If you are married to tar, you could use its -z flag, if you don't already, to compress. You can go further with the dump command, which records each backup level in /etc/dumpdates so that, much as with typical rsync usage, only files that have changed since the last dump are copied. Both ideas are sketched below.
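A minimal sketch of both ideas, with hypothetical paths and hostnames (dump assumes an ext2/3/4 filesystem):

    # Keep historic versions with rsync: -b moves files that would be
    # overwritten into a dated backup directory on the destination.
    rsync -azb --backup-dir=/backups/old/$(date +%F) -e ssh \
        /srv/data/ backup@backuphost:/backups/current/

    # Incremental backups with dump: level 0 is a full dump, level 1
    # captures only what changed since the last lower-level dump.
    # -u records the run in /etc/dumpdates.
    dump -0u -f /backups/home.level0.dump /home
    dump -1u -f /backups/home.level1.dump /home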