Mike writes:

| Unix filesystems do all sorts of strange things by design. For example,
| you can delete an open file, and then when it is closed and its reference
| count drops to zero, it will be purged. This is strange.

Well, I wouldn't call this strange; I'd call it a simple and elegant solution to a common set of problems on other computer systems. I've worked on systems where, if you delete a file, any program that had it open either starts getting garbage as it reads the same blocks that are now part of another file, or errors out in strange ways. The "solution" on some systems has been to have the delete return an error if the file is open. These all lead to very tricky programming problems, because programs need to handle the error. This means that even the simplest programs need to be aware of multiprogramming if they are to recover gracefully from file deletions by another process. Or, if they don't handle the errors, the disk fills up with junk files that didn't get deleted properly because someone had them open.

The unix solution was a huge simplification of the logic of it all. A file is a file, even if it no longer has a name in a directory. If process X opens a file, and process Y deletes the file from a directory, neither process gets any sort of error condition. No coding is required to recover from the collision, because it's not an error and there are no anomalies. Process X merely continues to read the data from the file, which stays around as a nameless file until process X closes it.

This also provides a simple and elegant way to create "scratch" files that don't leave behind relic data if the program bombs. You just create a new file, unlink it, and as long as you keep it open, it's your own file that nobody else can get at. If the program bombs or is killed, the kernel's process cleanup closes it, notes that the link count is now zero, and recycles the disk blocks. All this works without any need for a special "scratch file" flag and complex code to scavenge just that sort of file. Of course, lots of programs don't take advantage of this and leave behind named junk files. But that's the fault of the programmer, not of the OS, which has tried to make it easy for the programmer.

An interesting special case of this is a pipeline of processes, such as is produced by the cc command, which triggers a multi-phase chain of subprocesses. Now, cc usually produces scratch files in /tmp or /usr/tmp, and if the compile bombs, garbage files can be left behind. I've written some similar multi-process packages that don't do this. How? I just have the parent process open a set of scratch files, such as files 4 thru 7, and pass them to the subprocesses. The programs can either "just know" that they are to use certain pre-opened files, or you can give them command-line options like "-i5 -o7", meaning to read input from file 5 and write output to file 7. The parent has unlinked all these files, so if the entire flock is killed somehow, the files all get reclaimed automatically and there's no junk left behind. (It's also handy to say that if the debug flag is turned up above a minimal level, the files aren't unlinked. This way, during debugging you can see all the scratch files, but when you run with the debug flag off, they become invisible.)

It's not at all strange, once you understand why it was done this way, and how to take advantage of it. It's all part of why unix software tends to be smaller and more reliable than software on systems whose file systems don't work this way.
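
For anyone who hasn't tried it, here is a minimal C sketch of the nameless-scratch-file trick described above. The path template and the sample payload are only illustrative; the point is that after unlink() the open descriptor keeps working, and the blocks are recycled automatically once the last descriptor is closed (or the process dies).

    /* Scratch file with no name: create, unlink, keep using the descriptor. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        char path[] = "/tmp/scratchXXXXXX";
        int fd = mkstemp(path);            /* create and open a unique file */
        if (fd < 0) { perror("mkstemp"); return 1; }

        if (unlink(path) < 0) {            /* remove its name immediately... */
            perror("unlink");
            return 1;
        }

        const char msg[] = "scratch data\n";
        if (write(fd, msg, sizeof msg - 1) < 0)   /* ...yet it is still usable */
            perror("write");

        lseek(fd, 0, SEEK_SET);
        char buf[64];
        ssize_t n = read(fd, buf, sizeof buf - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("read back: %s", buf);
        }

        close(fd);  /* link count 0 and no one has it open: blocks recycled */
        return 0;
    }

If the program is killed between the unlink() and the close(), the kernel's process cleanup does the close for you and the space comes back anyway, which is exactly the property Mike's example was asking about.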
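And here is a sketch of the parent-side setup for the multi-process case: open a scratch file on a known descriptor (fd 5 here), unlink it unless a debug flag is given, then exec a subprocess that has been told, via a made-up "-i5" option, to read its input from that descriptor. The child program name "./phase1" and the scratch path are hypothetical, not anything cc actually does.

    /* Parent pre-opens a scratch file on fd 5 and hands it to a subprocess. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(int argc, char **argv)
    {
        int debug = (argc > 1 && argv[1][0] == '-' && argv[1][1] == 'd');

        const char *path = "/tmp/phase1.in";   /* illustrative name */
        int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd < 0) { perror("open"); return 1; }

        if (dup2(fd, 5) < 0) { perror("dup2"); return 1; }  /* pin to fd 5 */
        if (fd != 5) close(fd);

        if (!debug)
            unlink(path);  /* nameless now; reclaimed when the last user exits */
        /* with -d the file keeps its name so the scratch data can be inspected */

        pid_t pid = fork();
        if (pid == 0) {
            /* fd 5 stays open across exec; the child "just knows" or is told */
            execl("./phase1", "phase1", "-i5", (char *)NULL);
            perror("execl");
            _exit(127);
        }

        int status;
        waitpid(pid, &status, 0);
        return 0;
    }

Kill the whole flock at any point and the unlinked scratch files vanish with it; run with -d and they stay visible for debugging, just as described above.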