BLU Discuss list archive

Souper Computer

Subject: Souper Computer
From: slipcon at cs.jhu.edu (Scott Lipcon)
Date: Thu, 25 Feb 1999 01:39:31 -0500
In-reply-to: Your message of "Thu, 25 Feb 1999 00:42:14 EST." <199902250542.AAA19437@kukla.tiac.net>

Hi,

I've been reading this list for a while, but never posted.  I live in 
the Boston area, but I'm at school at JHU in Baltimore.  The local 
chapter of the ACM, of which I'm an active member and officer, got a 
grant last fall from a professor to build a Beowulf, so I'd be happy to 
answer any questions.  We recently got all the hardware issues worked 
out, and in fact performed our first parallel computations last week.  
The basic steps are:

- get a bunch of computers.  They don't have to be the same speed, but 
it helps in terms of load balancing, and makes installing Linux easier.  
We have 7 Pentium II 350s, with 128 MB of RAM and 6.4 GB IDE drives, and 
1 PII 350, with 384 MB of RAM and two 9 GB Ultra Wide SCSI drives, to act 
as a file server.
- network them together.  This is pretty important.  "Real" parallel 
computers have very high speed interconnect buses... I'd say that is 
the single biggest drawback of a Beowulf type system - your parallel 
workload had better be very coarse grained (lots of computation per 
message), otherwise your communications latency and limited throughput 
will kill you.  Our network is 100 Mb/s switched Ethernet.  I've heard 
of people using normal 10 Mb/s Ethernet for smaller clusters that 
they're setting up just for fun, but it probably doesn't work very well.
- install Linux on them all.  The Extreme Linux distribution from 
RedHat is good, but it's based on RedHat 5.0.  We chose to install 
RedHat 5.2 straight.
- Now you've got a network of workstations.  We found the next logical 
step was to figure out the logistics: NFS mounting /home from the 
"master" node, cross mounting every drive on every machine via NFS, 
exporting the password file from the master, getting rlogin to work 
without any passwords, even for root, etc...  There are a lot of things 
you have to worry about, especially if the system is going to be 
attached to the internet.
- Finally, you can install the tools that make it a Beowulf instead of 
a network of workstations - there are RPMs available on the web that 
make this a snap.  PVM (Parallel Virtual Machine) is one such library 
that you can use to write parallel programs.  MPI (Message Passing 
Interface) is the other popular one.
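
To put a rough number on the "coarse grained" point from the networking step, here is a toy model in Python.  It is only a sketch: the function names, the 100 seconds of work and the 0.5 ms per-message cost are all made up for illustration, and it assumes the computation splits evenly and that message costs simply add to the running time.

```python
def parallel_time(total_work_s, n_procs, n_messages, latency_s):
    """Toy estimate of wall time: evenly split compute plus message overhead.

    Assumes each processor pays latency_s for each of n_messages messages
    and that communication does not overlap with computation.
    """
    return total_work_s / n_procs + n_messages * latency_s


def modeled_speedup(total_work_s, n_procs, n_messages, latency_s):
    """Single-machine time divided by the toy parallel time."""
    return total_work_s / parallel_time(total_work_s, n_procs, n_messages, latency_s)


if __name__ == "__main__":
    work = 100.0      # hypothetical: 100 s of computation on one machine
    latency = 0.0005  # hypothetical: 0.5 ms cost per message
    for messages in (10, 1_000, 100_000):
        s = modeled_speedup(work, 8, messages, latency)
        print(f"{messages:>7} messages per node: speedup = {s:.2f}")
```

The same amount of computation goes from nearly the full 8x to well under 2x as the number of messages grows, which is why fine-grained jobs suffer on Ethernet.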

As well as playing with this Beowulf, I'm taking a very interesting 
course in Parallel Processing... the biggest thing I can stress is not 
to expect a lot.  Many people think "Hey, let's tie 8 computers 
together, and we'll go 8 times faster."  Not so - I could go on forever 
about this, just from what I've learned in the first month of class, 
and from our limited experience with our Beowulf.  I'll keep it short 
though.  The way you talk about the speed of a parallel system is to 
compare it to a uniprocessor system... "Speedup" is defined as the 
running time on a uniprocessor divided by the running time on the 
parallel machine.  For an N processor machine, the maximum possible 
speedup is N.  You won't get that in real life.  A simple model is 
Amdahl's Law, which says the following:

Assume that for a given program, a fraction x of the work (quoted as a 
percentage below) can be run only on one processor.  Also assume that 
the rest of the job is entirely parallelizable.  Therefore the 
remaining fraction (1 - x) of the job can be run on all N processors.  
The speedup in this case is:
N / (1 + (N - 1)x)
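
For anyone wondering where that expression comes from, here is a short derivation.  It is a sketch that takes x as a fraction between 0 and 1; T_1 and T_N are my own notation for the running time on one processor and on N processors.

```latex
\begin{align*}
T_N &= x\,T_1 + \frac{(1 - x)\,T_1}{N} \\
\text{speedup} &= \frac{T_1}{T_N}
  = \frac{1}{x + \frac{1 - x}{N}}
  = \frac{N}{Nx + 1 - x}
  = \frac{N}{1 + (N - 1)x}
\end{align*}
```

As N grows without bound the speedup approaches 1/x, so a 1% serial part caps the speedup at 100 no matter how many processors you add.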

If you plot that for small values of x, you'll get some surprising 
results:

For a 64 processor machine:
x = 1% (99% is fully parallelizable): speedup = 39.3
x = 5% (95% is parallelizable): speedup = 15.4

One more example - a large supercomputer, 1024 processors:
x = 1%: speedup = 91.2
x = 2%: speedup = 47.7
x = 8%: speedup = 12.4!!!

And that doesn't even take into account the communications latencies 
which would be common on a Beowulf-class system.
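
If you want to try other values of N and x, the Amdahl's Law figures can be recomputed with a few lines of Python.  This is just a minimal sketch; the function name amdahl_speedup is my own, and x is passed as a fraction (0.01 for 1%).

```python
def amdahl_speedup(n_procs, serial_fraction):
    """Speedup predicted by Amdahl's Law: N / (1 + (N - 1) x)."""
    return n_procs / (1 + (n_procs - 1) * serial_fraction)


if __name__ == "__main__":
    # The two machine sizes and serial fractions discussed in this post.
    cases = ((64, (0.01, 0.05)), (1024, (0.01, 0.02, 0.08)))
    for n_procs, fractions in cases:
        print(f"{n_procs} processors:")
        for x in fractions:
            print(f"  x = {x:.0%}: speedup = {amdahl_speedup(n_procs, x):.1f}")
```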

To give a real-world example: POV-Ray.  There exists a version of 
POV-Ray that includes PVM support.  At the time we ran it, only 7 of the 
8 systems were up and running, so our theoretical max speedup was 7.  We 
rendered the standard benchmark image, at 640x480, on one computer, in 
1:42 = 102 seconds.  We then re-did the image, running on 7 computers - 
it took 22 seconds.  102/22 = a speedup of 4.6.  Certainly not bad, but 
probably not what you'd expect either, if you go in with the "8 
computers = 8 times faster" attitude.
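
The POV-Ray timings can also be turned around to ask what serial fraction Amdahl's Law would need in order to explain them.  The sketch below does that in Python; the function names are my own, and it assumes all of the lost speedup comes from serial work, when in reality PVM communication overhead is part of it too.

```python
def measured_speedup(serial_seconds, parallel_seconds):
    """Speedup as single-machine time divided by cluster time."""
    return serial_seconds / parallel_seconds


def efficiency(speedup, n_procs):
    """Fraction of the ideal N-fold speedup actually achieved."""
    return speedup / n_procs


def implied_serial_fraction(speedup, n_procs):
    """Solve speedup = N / (1 + (N - 1) x) for x."""
    return (n_procs / speedup - 1) / (n_procs - 1)


if __name__ == "__main__":
    n = 7                             # nodes that were up for the test
    s = measured_speedup(102.0, 22.0) # timings from the render above
    print(f"speedup    = {s:.2f}")
    print(f"efficiency = {efficiency(s, n):.0%}")
    print(f"implied x  = {implied_serial_fraction(s, n):.1%}")
```

By this estimate a little over 8% of the job behaves as if it were serial, which is about what it takes to turn 7 machines into a 4.6x speedup.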
  
Anyway, I've rambled on enough... I'd be happy to answer any questions 
about our experiences, or whatever... either on the list or via email.

Web sites to check out:
http://www.beowulf.org
http://www.beowulf-underground.org
http://galaxy.acm.jhu.edu/ (shameless plug :)


Scott Lipcon

Boston Linux & Unix / webmaster@blu.org