David Kramer wrote:
| On Fri, 19 Jan 2001, Seth Gordon wrote:
| > > ...The recent
| > > description of unix's file-linking scheme as "strange" is an example
| > > of how even experienced unix users and programmers don't always
| > > understand the reasons behind the design. The unix fork+exec
| > > scheme is another.
| >
| > Can you expand on this? What do other OSs do for spawning a process
| > that don't fit the fork+exec model, what are the consequences of those
| > alternative techniques, and what problems does fork+exec solve?
|
| Nobody bit on this one, so I will attempt to answer the question, though
| it is outside any areas of expertise I pretend to have.

Hmmm .. I seem to have been neglectful here, so maybe I'll add to David's comments.

One of the significant illustrations of the use of the unix fork is the way that the apache server handles incoming requests. One setting in the httpd.conf file is the number of children to create. Apache forks N times, and these are all copies of the same program.

There's an immediate efficiency gain here over the "spawn" paradigm implemented by most other systems. When apache forks, the children don't need to do any initialization at all. The parent did that, and the children inherit all the parent's data unchanged, so the children know everything the parent did. Startup cost for programs like this can be significant. Paying it only once is a major performance improvement.

Also, the children all inherit the parent's open files. In this case, the significant open file is the socket that the parent is listening on. After the forks, all the children are also listening on the same shared socket. It isn't replicated; it is a single open file that is shared by all the forked processes. When a connection comes in, it goes to the first of these apaches that does an accept() call. Since HTTP requests are all independent and web servers don't maintain state between them, this works perfectly.
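Here is a rough sketch of that pre-fork arrangement. It is not how apache itself is written; it is in Python rather than C only to keep it short, and os.fork() and accept() are thin wrappers over the same system calls. The names serve_forever and prefork_demo, the loopback address, and the "reply with my pid" protocol are all made up for the illustration. The parent also plays the client, which a real server obviously would not.

```python
import os
import signal
import socket


def serve_forever(listener):
    """Child loop: accept on the inherited socket and reply with our pid."""
    while True:
        conn, _ = listener.accept()
        with conn:
            conn.sendall(str(os.getpid()).encode())


def prefork_demo(num_children=3, num_requests=6):
    # The parent does all the setup once, before any fork.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    children = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: the listening socket is already open, nothing to set up.
            try:
                serve_forever(listener)
            finally:
                os._exit(0)  # never fall back into the parent's code
        children.append(pid)

    # The parent acts as the client only to keep the sketch self-contained.
    replies = []
    for _ in range(num_requests):
        with socket.create_connection(("127.0.0.1", port)) as client:
            replies.append(int(client.makefile("rb").read().decode()))

    # Shut the children down and reap them.
    for pid in children:
        os.kill(pid, signal.SIGTERM)
        os.waitpid(pid, 0)
    listener.close()
    return children, replies
```

Calling prefork_demo() returns the child pids and the pid that answered each request. Which child answers a given request is up to the kernel, so the replies may all come from one child or be spread across several.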
If there's an idle server, the client gets an instant connection. If there is no idle server, the first server to complete its current task will do an accept() and get the connection. This sort of sharing of incoming requests is difficult to implement with anywhere near this response time on systems with a different process model.

There is a major memory saving possible here, too. On most hardware, linux and other unix-like systems now implement "copy on write" for the data of forked processes. So when a process forks, not only the code but all the global data is shared. If one process modifies a global datum, that memory block is copied for that process alone. But the global data set up before the fork doesn't need to be copied until it is modified. This is easy to implement (if the hardware supports it) with the unix fork mechanism. It is very difficult to implement with a "spawn" approach, because it's difficult to discover that memory in two separately started processes is identical and can be shared.

Of course, the primary example of the fork+exec scheme is its use to implement file redirection and pipelines within the various shells. This only takes a few lines of C on a unix-like system: between the fork and the exec, the child rearranges its own file descriptors, and the program it then execs simply inherits them. On most other systems, it is much more difficult. It typically entails having the command interpreter pass a whole lot of extra information to a spawned program, and then the startup code for that program has to understand what was passed and implement it correctly. It is very difficult to get the implementers of various compilers and interpreters to go along with this and do it in a consistent fashion. With the fork+exec approach, the code is in the command interpreter, and the new processes don't see it, so it works with all programs no matter what language they are written in.

This explains a lot of why wildcard handling varied so wildly in DOS, where expansion was left to each program rather than done once in the command interpreter.
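To make the redirection and pipeline point concrete, here is a minimal sketch of what a shell does for `cmd > file` and `left | right > file`. Again it is Python rather than C, but os.fork, os.pipe, os.dup2 and os.execvp correspond one-to-one to the C calls. The function names run_redirected and run_pipeline are invented for the example, and real shells handle many more cases (errors, job control, other descriptors) than this does.

```python
import os


def _exit_code(status):
    """Turn a waitpid() status into an exit code, or -1 if killed by a signal."""
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def _stdout_to_file(path):
    """Make file descriptor 1 refer to `path`, as the shell does for `> path`."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    os.dup2(fd, 1)
    os.close(fd)


def run_redirected(argv, out_path):
    """Rough equivalent of the shell's `argv > out_path`."""
    pid = os.fork()
    if pid == 0:
        # Child: rearrange descriptors, then replace ourselves with the program.
        try:
            _stdout_to_file(out_path)
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)  # reached only if the setup or the exec failed
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)


def run_pipeline(left, right, out_path):
    """Rough equivalent of `left | right > out_path`."""
    read_end, write_end = os.pipe()

    left_pid = os.fork()
    if left_pid == 0:
        try:
            os.dup2(write_end, 1)  # stdout goes into the pipe
            os.close(read_end)
            os.close(write_end)
            os.execvp(left[0], left)
        finally:
            os._exit(127)

    right_pid = os.fork()
    if right_pid == 0:
        try:
            os.dup2(read_end, 0)  # stdin comes from the pipe
            os.close(read_end)
            os.close(write_end)
            _stdout_to_file(out_path)
            os.execvp(right[0], right)
        finally:
            os._exit(127)

    # The parent must close both ends, or `right` never sees end-of-file.
    os.close(read_end)
    os.close(write_end)
    os.waitpid(left_pid, 0)
    _, status = os.waitpid(right_pid, 0)
    return _exit_code(status)
```

For example, run_redirected(["echo", "hello"], "out.txt") behaves like `echo hello > out.txt`. Neither echo nor whatever runs on the right of a pipe contains any code for redirection; each just writes to descriptor 1 or reads from descriptor 0 as it always does.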
With Windows, the command-line interface was pretty much abandoned, of course, and wildcard expansion, when it is implemented at all, is done differently by nearly every program. We see a bit of this in unix GUI tools, too, though the glob(3) C library routine and the perl glob() function are there to encourage consistency.

-
Subscription/unsubscription/info requests: send e-mail with "subscribe", "unsubscribe", or "info" on the first line of the message body to discuss-request at blu.org (Subject line is ignored).
BLU is a member of BostonUserGroups.
We also thank MIT for the use of their facilities.