[Discuss] Whence distributed operating systems?

Fri Apr 29 14:33:54 EDT 2016

On Thu, Apr 28, 2016 at 10:41 PM, Rich Pieri <richard.pieri at gmail.com> wrote:
> On 4/28/2016 8:30 PM, Bill Bogstad wrote:
>> The fact
>> that this was done via a fibre data bus vs. a faster local bus would
>> seem to me to be an implementation detail.  It still sounds like a
>> NUMA with three levels of memory access (CPU local, QBB local, remote
>> QBB).   But all of the memory transparently visible in a program's
>> address space.
>
> You might think that but it wasn't. The fibre bus never actually worked
> well. It was too slow mixed with PCI, with too much latency. It got
> congested and bogged down under even moderate loads.

I still see that as an attempt at NUMA that didn't work because of the
latency problems.  Not really surprising actually.   We all know what
happens when a program/system tries to actually use more memory than
is physically installed (i.e. actually using paging).   The latency
difference between local and remote QBB RAM might not have been as bad
as between RAM and disk, but it sounds like it was more than enough to
make it not work.  I suspect that if they hadn't insisted on making
the differences in latency completely invisible to applications, it is
possible that it would have worked better.   A way for a program to
provide hints might have helped.   Something like pin(address,
length)/unpin(address, length).  I think this would have been better
from a programming perspective than having to explicitly manage moving
data between local RAM and remote RAM/disk. You would still have a
single flat memory space programming model which would always work
(albeit slowly) and then you could throw in judicious pin()/unpin()
function calls to help the OS keep active memory local.
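
To make the pin()/unpin() idea concrete, here is a rough sketch in C.
The names pin(), unpin() and sum_with_hints() are hypothetical, taken
from the suggestion above and not from any real NUMA API.  As a
stand-in, the sketch maps them onto POSIX mlock()/munlock(), which only
keep pages resident in RAM; that is an assumption made for
illustration, and a real implementation would need an OS interface that
understands node locality.  The point is the shape of the calling code:
the hint may fail, and the program still works without it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical hint calls from the post. mlock()/munlock() are used
 * here only as a stand-in: they keep pages resident in RAM and know
 * nothing about which NUMA node holds them. Linux rounds the range to
 * page boundaries; other systems may require a page-aligned address.
 * Both return 0 on success and -1 on failure.
 */
static int pin(void *addr, size_t length)
{
    return mlock(addr, length);
}

static int unpin(void *addr, size_t length)
{
    return munlock(addr, length);
}

/*
 * Sum a buffer, asking the OS to keep it close while we work on it.
 * The hint is advisory: if pin() fails (for example because of a
 * locked-memory limit), the loop still runs over the same flat
 * address space, possibly more slowly.
 */
static uint64_t sum_with_hints(const uint32_t *data, size_t count)
{
    assert(data != NULL);

    size_t bytes = count * sizeof *data;
    int pinned = (pin((void *)data, bytes) == 0);

    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += data[i];

    /* Only undo the hint if it was actually applied. */
    if (pinned)
        unpin((void *)data, bytes);

    return total;
}
```

The caller never checks whether the hint succeeded, which is the
property I was after: correctness comes from the flat memory model, and
the hints only affect speed.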

Bill Bogstad