Is the write(2) system call atomic

markw at mohawksoft.com markw at mohawksoft.com
Sat Apr 8 12:50:52 EDT 2006


I hacked up your code to see if I could reproduce the problem. I have a
couple of SMP Linux boxes as well as some SMP FreeBSD 6 boxes.

I could not get your results.

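However, you shouldn't get the overwrite issues. As I understand it, after
fork() the parent and child share a single open file entry in the kernel,
and with it a single file offset, so one process's write() should not land
on top of the other's. If it could, every database in the world would fail
on Linux.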
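
A quick way to check that a forked child and its parent really do share one
file offset is sketched below. It is only an illustration: the function name
offset_after_child_write() and the scratch path are made up for this example
and are not from the original test program. The child writes through the
inherited descriptor, and the parent then asks the kernel where its own
offset is.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's file offset after the child
 * has written msg through the inherited descriptor, or -1 on error. */
off_t offset_after_child_write(const char *path, const char *msg)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: write once, then exit without touching anything else. */
        ssize_t n = write(fd, msg, strlen(msg));
        _exit(n == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Parent: wait for the child, then read back the current offset. */
    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    unlink(path);   /* remove the scratch file */
    return pos;
}
```

Calling offset_after_child_write("/tmp/shared_offset_demo.txt", "child\n")
should return 6 even though the parent never wrote anything, because the
child's write moved the offset that both processes share.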

I suspect that there is more to this problem than your example shows (as
posted, your example could not have produced the output you show). If one
of the processes were using lseek(), or if the file was opened with
fopen(), that might explain it.
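
On the fopen() point, one way stdio can change the picture (this is my guess
at the kind of effect meant, not something taken from the original program)
is that it keeps its own buffer in user space, and fork() copies that buffer
into the child. The sketch below uses a made-up helper,
size_after_buffered_fork(), and a made-up scratch path: the text is written
once, but both processes flush their own copy.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes msg once through stdio, forks before the
 * buffer is flushed, and returns the resulting file size (-1 on error).
 * msg is assumed to be much shorter than the stdio buffer. */
long size_after_buffered_fork(const char *path, const char *msg)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* The text sits in the stdio buffer; nothing has reached the file. */
    if (fputs(msg, fp) == EOF) {
        fclose(fp);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }
    if (pid == 0) {
        /* Child: fclose() flushes the child's copy of the buffer. */
        _exit(fclose(fp) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        fclose(fp);
        return -1;
    }

    /* Parent: fclose() flushes the parent's copy of the same text. */
    if (fclose(fp) != 0)
        return -1;

    struct stat st;
    if (stat(path, &st) < 0)
        return -1;
    unlink(path);   /* remove the scratch file */
    return (long)st.st_size;
}
```

With "hello\n" as the message the file ends up 12 bytes long, not 6. That is
duplication rather than overwriting, but it shows that once stdio is
involved, what reaches the file no longer matches the calls in the source
one for one.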

If this is done:

lseek(fd, 0, SEEK_END);
write(fd, buf, cb);

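Then that could surely explain the problem: the seek and the write are two
separate system calls, so another process can change the file in between
them.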
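
For comparison, here is a sketch of the usual way to avoid a separate seek:
open the file with O_APPEND, so the kernel positions each write() at the end
of the file as part of the write itself. The helper name
append_from_two_processes() and the scratch path are invented for this
example, and it assumes a local filesystem (NFS is a known exception). Each
process opens the file on its own, so the two do not share an offset.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: parent and child each open path with O_APPEND and
 * write one record. Returns the final file size, or -1 on error. */
off_t append_from_two_processes(const char *path,
                                const char *parent_msg,
                                const char *child_msg)
{
    /* Start from an empty file. */
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    /* Both processes run this part, each with its own descriptor. */
    const char *msg = (pid == 0) ? child_msg : parent_msg;
    ssize_t n = -1;
    fd = open(path, O_WRONLY | O_APPEND);
    if (fd >= 0) {
        n = write(fd, msg, strlen(msg));  /* no lseek() needed */
        close(fd);
    }
    int ok = (n == (ssize_t)strlen(msg));

    if (pid == 0)
        _exit(ok ? 0 : 1);

    /* Parent: always reap the child before checking results. */
    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0 || !ok)
        return -1;

    struct stat st;
    if (stat(path, &st) < 0)
        return -1;
    unlink(path);   /* remove the scratch file */
    return st.st_size;
}
```

With "parent\n" and "child\n" the result should be 13 bytes whichever
process gets there first, since neither record can land on top of the other.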

>
> There has been some discussion on another list regarding the write(2)
> system call on Linux.
> Here is a short code fragment to illustrate where the parent process
> creates and opens a file for writing, then forks a child. It was my
> impression that the kernel maintains a single structure for each open
> file. If that is true then we would see all 6 lines. In most of the
> failure cases we see that the first line and part of the second line of
> the parent are overwritten by the child.
> Note that I would probably use lockf(3), flock(2), or fcntl(2).
> fd = open(<filename>, O_CREAT | O_WRONLY | O_TRUNC, 0644);
> if ((pid = fork()) > 0) { /* parent */
>        sprintf(buf, "Parent first line\n");
>        sprintf(str, "Parent is process %d\n", getpid());
>        strcat(buf, str); /* append the second line before str is reused */
>        for (i=0; i < 2; ++i) {
>            sprintf(str, "----P=%d: %d\n", getpid(), i);
>            strcat(buf, str);
>        }
> } else if (pid == 0) { /* child */
>        buf[0] = '\0'; /* strcat() needs an empty string to start from */
>        for (i=0; i < 2; ++i) {
>            sprintf(str, "****C=%d: %d\n", getpid(), i);
>            strcat(buf, str);
>        }
> } else assert(0); /* error on fork() */
> /* each process writes its whole buffer with a single write() */
> assert(write(fd, buf, strlen(buf)) == (ssize_t)strlen(buf));
> assert(close(fd) == 0);
> if (pid) /* parent waits for child */
>      wait(NULL);
> return 0;
> }
>
> In the above case, we observed that on an SMP system with a 2.6 kernel,
> the child data in some cases overwrote some of the parent data. The code
> should create a file containing something like:
> ****C=19585: 0
> ****C=19585: 1
> Parent first line
> Parent is process 19584
> ----P=19584: 0
> ----P=19584: 1
>
> But in the failure case, the following result may be present.
> ****C=18099: 0
> ****C=18099: 1
> ocess 18098
> ----P=18098: 0
> ----P=18098: 1
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-csrc
Size: 795 bytes
Desc: not available
URL: <http://lists.blu.org/pipermail/discuss/attachments/20060408/aec1c2d9/attachment.bin>

