[Discuss] rsync v. cp in data migration
Bill Bogstad
bogstad at pobox.com
Sat May 25 03:41:58 EDT 2013
On Fri, May 24, 2013 at 11:29 PM, Alex Pennace <alex at pennace.org> wrote:
> On Fri, May 24, 2013 at 08:47:37PM -0400, Steve Harris wrote:
>> 1) Using a tar pipeline will (should) always be slower than a single
>> process (e.g., cp, cpio -p, rsync), because of the overhead of the two
>> processes and the system buffering for the pipe.
>
> Not necessarily. Earlier in this thread, someone mentioned the
> sendfile(2) system call in Linux. sendfile is largely limited to
> sending data out via a socket.
According to the manual pages, that hasn't been true since the 2.6.33 kernel.
I wrote a test program yesterday before mentioning sendfile() and used it
to copy files on an Ubuntu 12.04 system (3.2 kernel). Unless glibc is doing
something funky, it appears to work with regular files as output now.
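For anyone who wants to repeat the experiment, here is a minimal sketch of
such a test in Python. It is not the program I ran; the function name and
chunk size are made up for illustration. It assumes Linux 2.6.33 or later,
where sendfile() accepts a regular file as its output descriptor.

```python
import os

def copy_with_sendfile(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path using os.sendfile(); return bytes copied.

    Hypothetical helper for illustration. Assumes a Linux kernel new
    enough (2.6.33+) to allow a regular file as the output descriptor.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        offset = 0
        while offset < size:
            # The kernel moves the data itself; nothing is read into a
            # user-space buffer here.
            sent = os.sendfile(dst.fileno(), src.fileno(),
                               offset, min(chunk, size - offset))
            if sent == 0:
                # Source was truncated while we were copying.
                break
            offset += sent
    return offset
```

On an older kernel the call should fail with an OSError (EINVAL) instead of
copying anything, which makes it a quick way to check a given system.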
>....
> The big drawback to splice(2) is one of its ends must be a pipe. Our
> modified tar will have to take care to employ it only when it's dealing
> with a pipe (on the other hand, GNU tar already does an fstat on its
> output to check to see if it is going to /dev/null).
Another possible drawback (probably with sendfile() as well) is sparse
files: I have no idea how splice() or sendfile() will copy them.
The kernel certainly has access to the information needed to do it
right for sendfile(), but I would be surprised if it actually did.
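One way to find out is to compare apparent size with allocated size on the
source and on the copy. A rough sketch follows; the function names are
hypothetical, and it assumes Linux, where st_blocks is counted in 512-byte
units.

```python
import os

def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as it is on Linux,
    # regardless of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512

def looks_sparse(path):
    """Heuristic only: True if less space is allocated than the file's length."""
    apparent, allocated = size_report(path)
    return allocated < apparent
```

If looks_sparse() is true for the source but false for the copy, the holes
were filled in along the way. Treat the result as a hint rather than proof:
filesystem compression or inline storage of small files can also make the
allocated size smaller than the apparent size.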
BTW, I agree with Rich's complaint about the lack of tools to support
all of those filesystem extensions which kernel developers keep
adding. As a (former) system administrator, it bugs me.
Finally, I haven't seen fsarchiver mentioned. It's a descendant of partimage
that claims to copy "everything" and, according to its Changelog, handles
sparse files correctly.
Bill Bogstad