[Discuss] rsync v. cp in data migration

Tom Metro tmetro+blu at gmail.com
Thu May 23 17:02:44 EDT 2013


Greg Rundlett wrote:
> ...cp can be used to make a phase one copy, and then (shut off write
> access to source) rsync can be used to do the final copy.

I've successfully used this 2-phase technique on several occasions to
migrate data sets in the neighborhood of a few terabytes.
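
For anyone who wants to script it, here is a rough sketch of the two phases as command lines built in Python. The helper names, the example paths, and the flag choices (cp -a, rsync -aH --delete, optional --sparse) are my assumptions about a reasonable setup, not something taken from Greg's message. It only builds the argument lists, so you can inspect them before running anything.

```python
def phase_one_cmd(src, dst):
    """Bulk copy, run while the source is still in use.

    "cp -a" (GNU coreutils) preserves ownership, modes, timestamps and
    symlinks. The trailing "/." copies the contents of src into dst.
    """
    return ["cp", "-a", src.rstrip("/") + "/.", dst]


def phase_two_cmd(src, dst, sparse=False):
    """Catch-up pass, run after write access to the source is shut off.

    -a         archive mode (recursive, preserves metadata)
    -H         preserve hard links
    --delete   remove destination files that vanished from the source
               since phase one (an assumption; drop it if unwanted)
    --sparse   optional, recreate holes in sparse files
    """
    cmd = ["rsync", "-aH", "--delete"]
    if sparse:
        cmd.append("--sparse")
    # Trailing slashes make rsync sync directory contents, not the directory itself.
    cmd += [src.rstrip("/") + "/", dst.rstrip("/") + "/"]
    return cmd
```

Each list can be handed to subprocess.run(cmd, check=True). The hypothetical paths /data/old and /data/new would give "cp -a /data/old/. /data/new" for phase one and "rsync -aH --delete /data/old/ /data/new/" for phase two.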

Sure, tar might be faster than cp, but if you are using cp for a bulk
transfer that isn't time-critical, speed likely won't matter much. (It's
even possible that the pauses cp takes to refill its buffers keep it
from saturating your I/O bandwidth, which could be desirable if you are
running the job while the disks are in use.)

As pointed out, sparse files may be a concern, but you may know from the
nature of your data that you don't have any (e.g. if you know the files
are all of a certain type, or are managed by a VCS tool that doesn't
create sparse files). If you aren't sure, you can test for them by
comparing each file's allocated disk space to its apparent size: a
sparse file occupies fewer blocks than its length implies.
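
One possible check, sketched in Python: walk the tree and flag regular files whose allocated blocks cover less than their length. The function names are made up for illustration, and the 512-byte unit for st_blocks is the Linux convention, so treat it as an assumption elsewhere. It is a heuristic only. Filesystems that compress data or store tiny files inline in the inode can produce false positives.

```python
import os
import stat

BLOCK_UNIT = 512  # st_blocks is reported in 512-byte units on Linux


def looks_sparse(size, blocks):
    """True if the allocated space is smaller than the apparent size."""
    return size > 0 and blocks * BLOCK_UNIT < size


def find_sparse_files(root):
    """Return a sorted list of regular files under root that look sparse."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: don't follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and looks_sparse(st.st_size, st.st_blocks):
                hits.append(path)
    return sorted(hits)
```

Calling find_sparse_files() on the source directory and getting an empty list back suggests a plain cp won't inflate anything. If it does return paths, those are the files to handle with a sparse-aware copy.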

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/


