
BLU Discuss list archive



[Discuss] rsync v. cp in data migration



Derek,

I suspect that your timings are off because the test isn't quite apples
to apples. GNU tar's -S completely reads every file twice: once to test
sparseness, again to add it to the archive. GNU cp's documentation[1]
suggests that the --sparse=auto test algorithm may be smarter than that.
I suggest retesting with sparse file handling disabled: no -S with tar,
--sparse=never with cp.
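
For what it's worth, here is a rough sketch of what I mean by the retest. The function names and the exact form of the tar pipe are my own assumptions (I don't know precisely how you invoked it), and --sparse=never is a GNU cp option, so this won't work with other versions of cp.

```shell
# Copy the contents of $1 into $2 with a tar pipe, WITHOUT -S,
# so tar does no sparse-file detection at all.
copy_tar_pipe() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    tar -C "$src" -cf - . | tar -C "$dst" -xf -
}

# Copy the contents of $1 into $2 with GNU cp, with sparse
# detection turned off so both tools do the same amount of work.
copy_cp() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    cp -r --sparse=never "$src/." "$dst/"
}
```

In bash you can then run "time copy_tar_pipe /some/src /some/dst1" and "time copy_cp /some/src /some/dst2" (the paths are placeholders) and compare the numbers.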

Try this (as root) instead of the dd trick:

  sync && echo 3 > /proc/sys/vm/drop_caches

This will force the kernel to flush its buffers, then drop the various
caches and free the associated RAM. The caches will fill up again as
soon as anything is read, so you'll need to do this before each test
to clean things out.

Here's another thing that might be relevant. tar is a bit of a CPU hog,
and the tar pipe trick invokes two separate tar processes which can make
it more CPU bound than cp -r.

One last thing: test with lots of small files. Thousands of them. Try
using /usr as a source.
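
If you'd rather not copy /usr around, a synthetic tree is an alternative (my suggestion, not something I've benchmarked). The function name and the layout of 100 subdirectories below are arbitrary choices for illustration.

```shell
# Create $2 tiny files under directory $1, spread across up to
# 100 subdirectories so no single directory gets huge.
make_small_files() {
    dir=$1
    count=$2
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        sub="$dir/d$((i % 100))"
        mkdir -p "$sub" || return 1
        printf 'file %d\n' "$i" > "$sub/f$i"
        i=$((i + 1))
    done
}
```

Something like "make_small_files /tmp/smallfiles 20000" (placeholder path) gives you a source tree with thousands of files to point tar and cp at.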


[1] I must apologize for my earlier statements about cp not handling
sparse files. GNU cp, at least in the current coreutils, defaults to
--sparse=auto. YMMV with older versions and non-GNU versions of cp.

-- 
Rich P.




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org