
BLU Discuss list archive



[Discuss] Help with data migration



Greg Rundlett wrote:
> I'm not having good luck with my data migration. 
> 
> In the past, on boxes I've setup, rsync just flies.  In this case, we've
> got 10gE cards and robust hardware, quad core, 16GB RAM in the VM instance
> and VMWare ESX running on the iron, and something is just not right.  In
> summary, after NFS mounting the source, and switching to cp, I'm getting
> speeds as slow as 2.2MB/s for transfer.  rsync "locally" is only doing
> about 9MB/s.  I'm not sure how to check where the bottleneck is or what's
> wrong.
> 
> Network performance
> I already calculated the performance of the transfer end-to-end, and found
> that it was too slow (<10 MB/s).
> 
> The rate is 483 Mbits/sec which is *59 MB/sec*
> 
> Although the network is the slowest component, it still is capable of much
> more throughput than what I'm witnessing with the rsync transfer.

So what was the conclusion to this story? Any lessons to share?


> After successfully mounting the source filesystem over NFS, I did a small
> test using cp which showed a rate of 29 MB/s which although "slow" is 3
> times faster than the rates I've been getting from rsync.  So I switched to
> using cp instead of rsync and found that the transfer speed on a couple
> larger datasets came in at 2.22 MB/s - 4.3 MB/s
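
To put the quoted figures on one scale, here is a rough sketch in Python that converts the measured 483 Mbits/sec to MB/s and expresses each reported copy rate as a fraction of it. It assumes decimal units and 8 bits per byte, which gives roughly 60 MB/s, in line with the 59 MB/s quoted. The function name and labels are made up for illustration; the rates are the ones from the message above.

```python
def mbit_to_mbyte_per_s(mbits_per_s):
    """Convert megabits/s to megabytes/s (decimal units, 8 bits per byte)."""
    return mbits_per_s / 8.0


# Measured network rate quoted above.
link_mb_per_s = mbit_to_mbyte_per_s(483)  # 60.375

# Copy rates reported in the quoted message, in MB/s.
reported = {
    "rsync": 9.0,
    "cp over NFS, small test": 29.0,
    "cp over NFS, large datasets (low)": 2.22,
    "cp over NFS, large datasets (high)": 4.3,
}

for label, rate in reported.items():
    share = rate / link_mb_per_s
    print(f"{label}: {rate} MB/s, {share:.0%} of the measured link rate")
```

Even the best of these is about half of what the link was measured to carry, so the network test alone does not account for the slow copies.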

It wasn't clear from your original description of the environment
whether the storage server was a SAN appliance, or some sort of server
that you could run things on.

I also don't recall if you mentioned trying to use the rsync protocol
over the network, rather than using it between a local disk and an NFS
mounted disk, which is one of the least efficient ways to use rsync.
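
As a rough illustration of the difference, the Python sketch below only builds the two command lines; it does not run them. The host name "old-host" and all paths are hypothetical, and the network form assumes the storage server can actually run rsync, which would not be the case for a SAN appliance. When both paths are local (as with an NFS mount), rsync defaults to copying whole files and pulls every byte it examines through NFS. With a host:path endpoint, a second rsync runs on the far side and the two talk the rsync protocol over a remote shell.

```python
import shlex


def rsync_argv(src_dir, dst_dir):
    """Build an 'rsync -a' command line that copies the contents of src_dir."""
    # The trailing slash on the source means "the contents of this directory".
    return ["rsync", "-a", src_dir.rstrip("/") + "/", dst_dir]


# Source reached through an NFS mount: a single local rsync does all the work.
over_nfs = rsync_argv("/mnt/nfs/src", "/data/dst/")

# Source reached over the network (hypothetical host): rsync runs on both ends.
over_network = rsync_argv("old-host:/data/src", "/data/dst/")

print(shlex.join(over_nfs))
print(shlex.join(over_network))
```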

And I was surprised none of the "tar to a pipe" proponents reiterated
that solution here. The classic tar piped through rsh, that then runs
tar on the target storage server, should be an ideal fit for the initial
bulk transfer. It streams quite efficiently over a single socket
connection with no delays for high-level ACK/NAKs (only TCP).
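
For anyone who hasn't seen the tar pipe, here is a minimal sketch of it driven from Python with subprocess. It assumes a tar binary on both ends; the function name and the remote_prefix parameter are made up for illustration. With an empty prefix both tars run locally, which is handy for a dry run. A prefix such as ("rsh", "storage-host") or ("ssh", "storage-host"), with a hypothetical host name, would run the receiving tar on the target server.

```python
import subprocess


def tar_pipe(src_dir, dst_dir, remote_prefix=()):
    """Stream src_dir into dst_dir through a tar-to-tar pipe.

    remote_prefix is a hypothetical command prefix, for example
    ("ssh", "storage-host"), that runs the receiving tar on another host.
    Paths containing spaces would need extra quoting in that case.
    """
    # Sender: archive the contents of src_dir to stdout.
    sender = subprocess.Popen(
        ["tar", "-C", src_dir, "-cf", "-", "."],
        stdout=subprocess.PIPE,
    )
    # Receiver: unpack from stdin into dst_dir, preserving permissions.
    receiver_cmd = list(remote_prefix) + ["tar", "-C", dst_dir, "-xpf", "-"]
    receiver = subprocess.Popen(receiver_cmd, stdin=sender.stdout)
    # Drop our copy of the pipe so the sender sees SIGPIPE if the
    # receiver exits early.
    sender.stdout.close()
    receiver_rc = receiver.wait()
    sender_rc = sender.wait()
    if sender_rc != 0 or receiver_rc != 0:
        raise RuntimeError(
            f"tar pipe failed (sender={sender_rc}, receiver={receiver_rc})"
        )
```

Calling tar_pipe("/data/src", "/data/dst") copies locally; adding remote_prefix=("ssh", "storage-host") sends the same stream over a single connection to the other machine.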

Though if you really want to saturate your link, you want something more
specialized, like BBCP:
http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm

Note that it is designed for maximizing transfer rates (using parallel
streams) over WAN links, where latency normally delays ACK packets and
throttles throughput.

However, it sounds like you had an underlying LAN problem that needed to
be understood and resolved before resorting to creative solutions.

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/



BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org