Asynchronous File I/O on Linux

Tue May 18 09:46:38 EDT 2010

On 05/18/2010 08:58 AM, Mark Woodward wrote:
>
> A few points:
> (1) The disk block I am requesting will never be in cache unless I
> have requested it.
> (2) A "good database" i.e. not MySQL, something like Oracle, DB2,
> PostgreSQL, etc. do their own caching and manage their own data
> access. Oracle still has their own device level access system.
> (3) A "few" (4) milliseconds shaved off a function that is run half a
> million to a million times is between 1/2 hour and an hour of wall
> clock time. That is important.
> (4) If asynchronous I/O is not used, then I will *always* have the
> worst case scenario of purely sequential reads.
>
Mark,
I think that what you want to do is certainly possible, but you are
going to need more than open(2), lseek(2), and read(2): you will also
need ioctl(2). You'll probably need to look at the actual driver code
to see exactly what each ioctl does. You should be able to achieve what
you want without subverting the driver, although you could also subvert
it if you want to. You may also want to look at mmap(2).

For those who might be unfamiliar with mmap(2), you can use this system
call to map a file into virtual memory. The downside is that you can
run into memory issues, but you would face the same issues with your
own disk caching. The nice thing about mmap(2), whether you use it to
map a file or just to manage your memory, is that you allocate on page
boundaries. In contrast, malloc(3) gives you small chunks and uses
various schemes, such as the classic brk(2) and sbrk(2), to obtain
virtual memory. I don't know the actual underlying scheme malloc(3)
uses in Linux, but some versions use mmap(2) in addition to sbrk(2) and
brk(2). I assume that the files you would be using would be a large
database where mapping the whole thing with mmap(2) would be
impractical, but mmap(2) for your memory management might be much
better than using malloc(3) or sbrk(2).
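
As a rough illustration of the two uses of mmap(2) described above
(mapping a file, and getting page-aligned memory directly), here is a
minimal C sketch. The helper names map_file(), page_alloc(), and
is_page_aligned() are made up for this example, and it assumes a
Linux/glibc system where MAP_ANONYMOUS is available. It is not meant as
a replacement for a real allocator or database cache.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only.
 * Returns the mapping and stores its length in *len, or NULL on failure. */
static void *map_file(const char *path, size_t *len)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {  /* mmap rejects length 0 */
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: get len bytes of zero-filled memory straight from
 * the kernel. The kernel rounds the length up to whole pages and the
 * returned address is page aligned. Release it with munmap(p, len). */
static void *page_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* True if p sits on a page boundary. */
static int is_page_aligned(const void *p)
{
    return (uintptr_t)p % (uintptr_t)sysconf(_SC_PAGESIZE) == 0;
}
```

A caller would read a file's bytes through the pointer returned by
map_file(), and use page_alloc() where it would otherwise call malloc(3)
for a large buffer. Both results are released with munmap(2).
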

Note that in C++, the new operator is normally layered directly on
malloc(3), so if you are writing in C++, you do have to be a bit
careful that you don't subvert sbrk(2) and brk(2). Again, a quick
explanation: sbrk(2) allocates virtual memory sequentially from the top
of the process data segment until it hits a barrier. In contrast,
mmap(2) can allocate anywhere in your virtual memory space, including
the region above the barrier that is blocking sbrk(2) and brk(2) from
allocating any more. I've written code that specifically does this to
test Purify and a new version of malloc(3). In general, there will not
be a conflict between pages allocated by mmap(2) and sbrk(2), but it
can happen.
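
To make the difference in placement concrete, here is a small C sketch
that grows the data segment twice with sbrk(2) and then asks mmap(2) for
an anonymous page, so the addresses can be compared. The helper names
brk_alloc(), anon_alloc(), in_range(), and show_layout() are invented
for illustration. It assumes Linux with glibc, and the exact addresses
printed will differ from run to run.

```c
#define _DEFAULT_SOURCE          /* exposes sbrk() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: extend the data segment by len bytes.
 * Returns the previous break, i.e. the start of the new region. */
static void *brk_alloc(size_t len)
{
    void *old = sbrk((intptr_t)len);
    return old == (void *)-1 ? NULL : old;
}

/* Hypothetical helper: one anonymous mapping of len bytes.
 * The kernel picks the address, independent of the program break. */
static void *anon_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* True if p lies inside the half-open range [lo, hi). */
static int in_range(const void *p, const void *lo, const void *hi)
{
    return (uintptr_t)p >= (uintptr_t)lo && (uintptr_t)p < (uintptr_t)hi;
}

/* Print where two sbrk regions and one mmap region landed.
 * The sbrk memory is deliberately not given back: malloc(3) may also
 * move the break, so shrinking it blindly would not be safe. */
static void show_layout(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *a = brk_alloc(page);
    void *b = brk_alloc(page);
    void *m = anon_alloc(page);
    printf("sbrk #1: %p\nsbrk #2: %p\nmmap   : %p\n", a, b, m);
    if (m != NULL)
        munmap(m, page);
}
```

Calling show_layout() from main() should show the two sbrk(2) regions
back to back, with the mmap(2) region somewhere else in the address
space.
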

-- 
Jerry Feldman <gaf-mNDKBlG2WHs at public.gmane.org>
Boston Linux and Unix
PGP key id: 537C5846
PGP Key fingerprint: 3D1B 8377 A3C0 A5F2 ECBB  CA3B 4607 4319 537C 5846