Was Moore's law, now something else, parallelism
Mark Woodward
markw-FJ05HQ0HCKaWd6l5hS35sQ at public.gmane.org
Wed Jul 14 07:48:19 EDT 2010
On 07/13/2010 11:02 PM, discuss-request-mNDKBlG2WHs at public.gmane.org wrote:
> Date: Tue, 13 Jul 2010 15:21:42 -0400
> From: Jerry Feldman <gaf-mNDKBlG2WHs at public.gmane.org>
> Subject: Re: Was Moore's law, now something else, parallelism
> To: discuss-mNDKBlG2WHs at public.gmane.org
>
> On 07/13/2010 02:40 PM, Mark Woodward wrote:
>> On 07/13/2010 10:28 AM, Edward Ned Harvey wrote:
>>>> From: discuss-bounces-mNDKBlG2WHs at public.gmane.org [mailto:discuss-bounces-mNDKBlG2WHs at public.gmane.org] On
>>>> Behalf Of Mark Woodward
>>>>
>>>> Maybe it's an OS issue? Like the RAID process described above, operating
>>>> systems have to become far more modular and parallel to benefit.
>>>
>>> You mean more modular and parallel than what? They already support SMP (and
>>> have for years), virtualization, and hyperthreading... How do you mean more
>>> modular?
>>
>> This is a complex discussion; to keep it simple I will use several vague
>> generalities, so understand that I know the Linux kernel does use threads
>> and that there are no true absolutes here. That said, the problem with a
>> monolithic kernel design is that it is designed primarily and conceptually
>> as an API library for applications. An application makes a kernel call,
>> the thread jumps into kernel code, executes the API code, and returns a
>> value.
>>
>> In the multiple-processor paradigm, a task places its request in a message
>> queue, a user-space process reads that queue, executes the code, and
>> places the result back in the message. On a highly parallel system,
>> multiple CPUs/threads will be executing these requests at once.
>>
>> While there is more work per request in the second example, the overall
>> throughput of the second example will be higher on a highly parallel
>> system.
>>
>>>> That whole user-space micro-kernel process stuff doesn't sound so useless
>>>> now. Monolithic/modular kernels ruled while CPU cores were scarce.
>>>
>>> They still rule now. Is there some alternative that doesn't use a
>>> monolithic or modular kernel? I don't know of any other option...
>>
>> The point I'm making is that monolithic kernels will not perform as well
>> as micro-kernel designs when there are many, many CPUs.
>
> This is not entirely the case. The monolithic vs. micro-kernel argument
> has been around for many years, but the central issues surround the
> kernel data structures, such as memory, cache, I/O, and more. Some
> system architectures allow the administrator to dedicate specific CPUs
> and memory to an OS so that you can have multiple different operating
> systems on the same physical system. But when you get into the sharing
> of resources, you run into the classic bottlenecks that exist whether
> you have a monolithic kernel or a micro-kernel. Years ago at a company
> we were discussing whether to (1) use Unix, (2) use QNX, or (3) do it
> ourselves. QNX is essentially the type of OS that uses a micro-kernel
> with message passing. I have not looked at its architecture for years.
>
> Additionally, many things have complicated the kernel architecture. Bill
> (or someone else) mentioned SMP. But also very important in the server
> space is NUMA: you may have memory attached to a particular physical CPU
> in a multi-CPU system, but how does the kernel manage what goes to that
> memory?
>
> Basically, the nature of the processing can determine whether a
> monolithic kernel performs better than a micro-kernel. It all comes down
> to resource management, and with multiple CPUs and multiple memories you
> have a very complex system; then add the cache.
>
I agree with you 100%. The point I was trying to make, I guess, is that a
kernel built for many CPUs will look a lot more like a NUMA micro-kernel
than like a monolithic kernel.
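
To make that concrete, here is a rough user-space sketch of the
request/reply pattern quoted above, written in plain C with pthreads
(this is illustrative only, not anyone's actual kernel code): a client
thread posts a request, a "server" thread executes it and posts the
result back. In a micro-kernel OS the client and server would live in
separate address spaces, but the shape of the interaction is the same.

    /* Rough sketch of the message-passing pattern discussed above.
     * Build with: cc -pthread -o msg msg.c */
    #include <pthread.h>
    #include <stdio.h>

    struct request {
        int arg;                  /* the work to do (here: square a number) */
        int result;               /* filled in by the server                */
        int posted;               /* client has written arg                 */
        int done;                 /* server has written result              */
        pthread_mutex_t lock;
        pthread_cond_t  ready;    /* signalled when a request is posted     */
        pthread_cond_t  finished; /* signalled when the result is ready     */
    };

    static struct request req = {
        .lock     = PTHREAD_MUTEX_INITIALIZER,
        .ready    = PTHREAD_COND_INITIALIZER,
        .finished = PTHREAD_COND_INITIALIZER,
    };

    static void *server(void *unused)
    {
        (void)unused;
        pthread_mutex_lock(&req.lock);
        while (!req.posted)
            pthread_cond_wait(&req.ready, &req.lock);
        req.result = req.arg * req.arg;   /* "execute the code" for the caller */
        req.done = 1;
        pthread_cond_signal(&req.finished);
        pthread_mutex_unlock(&req.lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;

        pthread_create(&tid, NULL, server, NULL);

        pthread_mutex_lock(&req.lock);    /* the client posts its request */
        req.arg = 7;
        req.posted = 1;
        pthread_cond_signal(&req.ready);
        while (!req.done)                 /* and waits for the reply      */
            pthread_cond_wait(&req.finished, &req.lock);
        pthread_mutex_unlock(&req.lock);

        printf("server replied: %d\n", req.result);
        pthread_join(tid, NULL);
        return 0;
    }
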
A monolithic kernel with SMP manages multiple CPUs against a conceptually
single resource pool. A heavily parallel kernel will need to divide up the
resources and assign them to one CPU, or a subset of CPUs, in the system
to avoid or reduce resource conflicts and false parallelism. It's like
processor affinity settings on steroids.
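
As a tiny illustration of the affinity point, and assuming Linux
(sched_setaffinity(2) is Linux-specific, and CPU 2 is an arbitrary
choice), something like this restricts one process to a single CPU. A
parallelism-first kernel would do this kind of partitioning for whole
pools of resources, not just for one process:

    /* Minimal sketch: pin the current process to CPU 2.
     * Build with: cc -o pin pin.c */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(2, &set);               /* allow only CPU 2 (arbitrary choice) */
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("now restricted to CPU 2\n");
        return 0;
    }
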
Hardware may have to change as well: with one address bus and one data
bus, memory will be a bottleneck. Even with in-CPU caching, there will
still be contention for memory. I/O is always a disaster, and as more and
more CPUs fight for access it will only get slower.
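
If you want to see the contention problem in miniature, here is a rough
user-space experiment (the thread and iteration counts are arbitrary, and
strictly speaking it shows cache-line contention rather than raw bus
contention, so treat the numbers as illustrative only): four threads
bumping one shared counter, then four threads each bumping their own
padded counter.

    /* Rough contention experiment. Build with: cc -O2 -pthread -o contend contend.c */
    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>

    #define NTHREADS 4
    #define ITERS    10000000L

    /* one counter per thread, padded onto its own cache line */
    struct padded { long n; char pad[64 - sizeof(long)]; };

    static long shared_counter;
    static struct padded private_counter[NTHREADS];

    static void *hammer_shared(void *arg)
    {
        (void)arg;
        for (long i = 0; i < ITERS; i++)
            __sync_fetch_and_add(&shared_counter, 1);  /* every CPU fights here */
        return NULL;
    }

    static void *hammer_private(void *arg)
    {
        struct padded *mine = arg;                     /* this thread's own line */
        for (long i = 0; i < ITERS; i++)
            __sync_fetch_and_add(&mine->n, 1);
        return NULL;
    }

    static double timed_run(void *(*fn)(void *), int use_private)
    {
        pthread_t t[NTHREADS];
        struct timespec a, b;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, fn,
                           use_private ? (void *)&private_counter[i] : NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);
        return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
    }

    int main(void)
    {
        printf("shared counter:   %.2f s\n", timed_run(hammer_shared, 0));
        printf("private counters: %.2f s\n", timed_run(hammer_private, 1));
        return 0;
    }
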
The question is whether memory and I/O can be managed well enough for tens
or hundreds of cores. It may be that the current technology for CPUs is at
an evolutionary dead end.