On Fri, Jan 2, 2009 at 3:05 PM, John Abreau <jabr-iwcNaMm7aMIiq3RsQ1AnAw at public.gmane.org> wrote:

> Hi, Jay.
>
> That's pretty much what I assumed the process would be. The description
> doesn't address my two concerns, though:
>
> 1. By mounting it as a filesystem and then running rsync on top of that,
> rsync sees the S3 filesystem as a "local" filesystem, and therefore, as
> part of checking whether a file needs to be updated, it copies the entire
> file from S3 to generate its hash for comparison. Rsync to a remote system
> invokes rsync on the remote end to compute the hash, and avoids the
> bandwidth usage that the "local" rsync incurs.
>
> 2. The rsync snapshots process uses hard links to make each daily backup
> directory look like a complete filesystem -- daily.0, daily.1, daily.2,
> etc. are all complete filesystems from different days, but files that are
> the same in all of them are hard-linked to a single instance, so storage
> space isn't wasted on multiple copies of the same file. Is it possible to
> do the same with an S3-based solution?

John,

You may be able to use a combination of Amazon S3 for storage and Amazon
EC2 to handle the rsync snapshots. I came across this HowTo for accessing
S3 data from an EC2 instance:

http://developer.amazonwebservices.com/connect/entry.jspa?externalID=931&categoryID=100
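For concreteness, here is a minimal sketch of the hard-link snapshot
rotation John describes, pointed at a remote host over ssh so that rsync
runs on both ends (sidestepping the full-file reads of concern 1). The
host "ec2-backup", the "backup" user, and the paths are hypothetical
placeholders, not details from this thread:

    #!/bin/sh
    # Rotate snapshots on the remote end: drop the oldest, shift the rest.
    # (Hypothetical host/user/paths; adjust to taste.)
    ssh backup@ec2-backup '
        rm -rf /mnt/snapshots/daily.2
        [ -d /mnt/snapshots/daily.1 ] && mv /mnt/snapshots/daily.1 /mnt/snapshots/daily.2
        [ -d /mnt/snapshots/daily.0 ] && mv /mnt/snapshots/daily.0 /mnt/snapshots/daily.1
    '

    # Copy only changed files into daily.0; anything unchanged since
    # yesterday is hard-linked against daily.1, so every daily.N looks
    # like a complete filesystem while each unique file is stored once.
    # (On the first run daily.1 won't exist yet; rsync warns and simply
    # copies everything.)
    rsync -a --delete --link-dest=../daily.1 \
        /home/ backup@ec2-backup:/mnt/snapshots/daily.0/

Note that --link-dest is resolved relative to the destination directory,
so ../daily.1 here points at /mnt/snapshots/daily.1 on the remote side.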
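One caveat on concern 2: S3 itself is an object store with no notion of
hard links, so the daily.N trees can't live directly in a bucket. That's
where the S3+EC2 combination suggested above comes in: keep the link-based
snapshot tree on the EC2 instance's own filesystem (instance storage or an
EBS volume), and use S3 only as durable storage behind it, e.g. by
archiving rotated-out snapshots into the bucket.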