On Tue, May 18, 2010 at 10:20 AM, Richard Pieri <richard.pieri-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:
> On May 18, 2010, at 8:58 AM, Mark Woodward wrote:
>>
>> Wait, even from a pedantic perspective, asynchronous I/O is the ability to issue disk I/O requests without blocking the process or thread. I am merely attempting to use this ability to optimize a particular type of operation on a file.
>
> No, you're not. Really. Try this: open() a file handle with the async I/O option then try to read() and see what happens. Experiment, because the results are not what you seem to expect.

He already knows that he can't do what he wants with the standard read() function. He is trying to determine if some other function will work.

>> With tagged queuing on SATA and SCSI before it, a driver is able to issue multiple requests simultaneously to the device and the device is supposed to be able to get requested blocks in cache and return them over the device I/O bus.
>
> This is concurrent I/O.

Can we stop harping on what to call what he wants to accomplish? Fine, we'll call it concurrent I/O. So how does he do the equivalent (submit multiple requests in one operation) from a Linux/UNIX application? I'm guessing that he wants to interleave the I/O and computation time in his app and therefore wants to submit the requests in a non-blocking fashion and either receive an event or poll for completion status. He also doesn't want to open the file multiple times, which causes a problem since a file descriptor can only have one lseek() position at a time. I can imagine scenarios (for example, a library might be passed an already-opened FD) which would make it helpful to be able to do this. I've heard no comment about why using /proc/self/fd/### to reopen the same file (sketched below) is not an acceptable solution to that part of his problem.

>> (3) A "few" (4) milliseconds shaved off a function that is run half a million to a million times is between 1/2 hour and an hour of wall clock time. That is important.
>
> If you stripe across three spindles then you cut your access times by approximately 33% without having to code anything. But if you code it anyway then your code will probably run *slower* because you're wasting CPU cycles trying to optimize something with completely different seek timings from what you expect from a single spindle.

Err, CPU cycles are practically free compared to disk seeks. That's why disk schedulers implement things like elevator algorithms rather than FIFO:

http://en.wikipedia.org/wiki/Elevator_algorithm

These algorithms tend to work better with longer queues of requests, and he wants to fill the scheduler's queue with pending requests. Think of this as application-directed file read-ahead rather than a more naive "read the next NN Kbytes".

You are making the assumption that the particular set of requests that his SINGLE (go fast) app wants to make happen to have been written by the filesystem so they end up in stripes on different spindles. Striping data works well for high-speed sequential file I/O, but may have little or no benefit for other access patterns. A good mirrored disk implementation should always speed up reads with any access pattern (though it is no help for writes). Of course, this is ONLY true if the (go fast) app has a way to efficiently submit multiple requests, and it's also helpful if you can interleave your computation with I/O time.
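Roughly, the /proc/self/fd trick looks like this (untested sketch; the function name is mine and error handling is omitted):

    /* Untested sketch: reopen an already-open descriptor via /proc/self/fd
     * so each reader gets its own file offset.  Linux-specific (needs /proc). */
    #include <stdio.h>
    #include <fcntl.h>

    int reopen_readonly(int fd)
    {
        char path[32];

        /* /proc/self/fd/N is a magic symlink to the file behind fd N;
         * open()ing it yields a new open file description with an
         * independent lseek() position. */
        snprintf(path, sizeof path, "/proc/self/fd/%d", fd);
        return open(path, O_RDONLY);   /* -1 on error, e.g. /proc not mounted */
    }

That would let a library hand each worker its own offset even when it was only given a single already-opened FD.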
BTW, in thinking/googling about this, I ran across two interesting system calls which could help:

pread(fd, buf, count, offset) - sort of like a combined lseek()/read(). Unfortunately, it is blocking, so you would have to have one thread per request and some kind of synchronization with the main computation thread. The advantage is that it does not change the file offset, so multiple threads could use the same FD.

readahead(fd, offset, count) - Linux-specific way to pre-populate the page cache with data from a file. The I/O is done in page-sized chunks, so offset is rounded down and offset+count rounded up to page boundaries. This is a blocking operation, so threads would be required. It doesn't affect the file offset, so the threads could use the same FD as the computation thread.

One approach might be to have the computation thread spawn a new I/O thread for each readahead() call. The computation thread would then call the regular read() when it actually needed the data. If the I/O interleaving worked, the read() would return immediately; if not, the computation thread would block until the data was available. If the computation thread can't be allowed to ever block, then having the I/O thread use pread() and some kind of completion queue for the computation thread to examine might work.

If threads are totally unacceptable, I don't see how to do this. I see no way to submit multiple requests in a single system call, so if you want (near) simultaneous requests (to feed the elevator algorithm) you seem to need one thread per request. A single system call that combined pread() and readv() would be nice.

Bill Bogstad
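P.S. Here's a rough, untested sketch of the thread-per-readahead() idea above; the struct and helper names are mine, and error handling is minimal:

    /* Untested sketch of thread-per-readahead(), assuming Linux + glibc.
     * Compile with -lpthread.  Names below are mine, for illustration. */
    #define _GNU_SOURCE
    #include <fcntl.h>      /* readahead() */
    #include <pthread.h>
    #include <stdlib.h>

    struct ra_req {
        int     fd;
        off64_t offset;
        size_t  count;
    };

    static void *ra_thread(void *arg)
    {
        struct ra_req *r = arg;

        /* Blocks in this thread while the kernel pulls the pages into
         * the page cache; the file offset of r->fd is untouched, so the
         * computation thread can keep using the same descriptor. */
        readahead(r->fd, r->offset, r->count);
        free(r);
        return NULL;
    }

    /* Called by the computation thread well before the data is needed. */
    int start_readahead(int fd, off64_t offset, size_t count)
    {
        pthread_t tid;
        struct ra_req *r = malloc(sizeof *r);

        if (!r)
            return -1;
        r->fd = fd;
        r->offset = offset;
        r->count = count;
        if (pthread_create(&tid, NULL, ra_thread, r) != 0) {
            free(r);
            return -1;
        }
        return pthread_detach(tid);
    }

The computation thread would call start_readahead() as soon as it knows the next offset it will want, then do its normal read() later; if the readahead finished in time, the read() is satisfied straight from the page cache.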