> The prior info might also explain why rsync is slow in this situation.
> With your use case of a sparse file that's only about 10% used, and your
> point that it still takes time to process the zeros produced by the OS,
> which rsync then has to calculate an MD5 hash of, it can take a while.

Here's a benchmark. These are empty TrueCrypt volumes, so the non-sparse
file takes 5G on disk, while the sparse one takes 256K on disk and is
"apparently" 5G in length.

$ time cat truecrypt-5G-sparsefile.tc > /dev/null ; time cat truecrypt-5G-nonsparsefile.tc > /dev/null
real    0m6.854s
real    1m33.533s

$ time md5sum truecrypt-5G-sparsefile.tc > /dev/null ; time md5sum truecrypt-5G-nonsparsefile.tc > /dev/null
real    0m18.398s
real    1m25.641s

$ time gzip --fast -c truecrypt-5G-sparsefile.tc > /dev/null ; time gzip --fast -c truecrypt-5G-nonsparsefile.tc > /dev/null
real    0m37.922s
real    4m35.956s

> What you really need is a hypothetical sparse_cat that is file system
> aware and can efficiently skip over the unused sectors. Or better yet,
> the equivalent functionality built into your archiving tool.

I agree, that would be nice. However, as the benchmark above shows, you
may be overestimating the time it takes to read or md5sum the zeros in
the hole of a sparse file: the OS manufactures them without touching the
disk, so they cost seconds rather than minutes. The hypothetical
sparse_cat would improve performance, but only marginally.

> Basically they use a VMware tool to back up the VM image, and then
> rsync that backup file.

Oh la la. That might be OK for them, having already bought the license
for other purposes, but it's $995 or higher, as far as I can tell.
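For the curious, here's a minimal sketch of what that hypothetical
sparse_cat could look like. It assumes a Linux system with the
SEEK_DATA/SEEK_HOLE lseek() extensions (kernel 3.1 or later, on a
filesystem that supports them). It walks the file's allocated extents
and copies only the data regions to stdout, skipping the holes without
ever reading their manufactured zeros. Note the output is therefore not
byte-identical to plain cat; a real archiving tool would also have to
record where the holes are so they could be reconstructed on extraction.

/* sparse_cat.c -- sketch only; short writes treated as errors. */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    off_t pos = 0;
    for (;;) {
        /* Find the start of the next data region at or after pos;
         * ENXIO means there is no more data before EOF. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO) break;          /* done */
            perror("lseek(SEEK_DATA)"); return 1;
        }
        /* Find where that data region ends (EOF counts as a hole). */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0) { perror("lseek(SEEK_HOLE)"); return 1; }

        /* Copy just the [data, hole) region to stdout. */
        if (lseek(fd, data, SEEK_SET) < 0) { perror("lseek"); return 1; }
        off_t left = hole - data;
        char buf[1 << 16];
        while (left > 0) {
            size_t want = left < (off_t)sizeof(buf)
                        ? (size_t)left : sizeof(buf);
            ssize_t n = read(fd, buf, want);
            if (n <= 0) { perror("read"); return 1; }
            if (write(STDOUT_FILENO, buf, n) != n) {
                perror("write"); return 1;
            }
            left -= n;
        }
        pos = hole;
    }
    close(fd);
    return 0;
}

To try it, build with cc and make a test hole with GNU coreutils'
truncate, e.g. "truncate -s 5G test.img", then compare "time cat
test.img > /dev/null" against "time ./sparse_cat test.img > /dev/null".
Per the benchmark above, expect the savings to be those few seconds of
zero-copying, which is why I'd call the improvement marginal.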