On Wed, May 19, 2010 at 10:32 AM, Richard Pieri
<richard.pieri-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:
>> Caching won't help me if I only want to look at each chunk once.  If
>> the data was in the file sequentially then
>> the built-in kernel readahead would help.  If the file format is fixed
>> and I want to process the data in some other order
>> than sequential then the simplistic kernel readahead isn't going to
>> help (and may make things slower).
>
> Yeah... see... the problem now is the file storage format.  What you
> really want now is an index into the actual data: find what you want
> from the index and use that pointer to jump immediately to the data
> you want instead of having to seek across Ghu knows how much file.
> As I said, this has been solved before.

Err, how do you "jump immediately to the data" without "having to
seek"?  The only way I know to "jump immediately.." via Linux/POSIX
APIs is explicitly with lseek() (or implicitly with pread()).
lseek() is cheap, since all the kernel has to do is change its
internal offset counter for the file descriptor associated with a
disk file.  It's only when you do the subsequent read() that any real
cost is incurred.  Assuming uncached disk files, that read is likely to
require disk head seeks, which is where the time cost comes into play,
and I see no way around that.

Bill Bogstad
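
To make the lseek()/read() versus pread() distinction concrete, here is a minimal sketch using Python's os module, which wraps the same POSIX calls. The 16-byte record size, the record layout, and the helper names are assumptions made up for this illustration; nothing here reflects the actual file format under discussion.

```python
import os
import tempfile

RECORD_SIZE = 16  # hypothetical fixed record size, for illustration only


def make_demo_file(num_records):
    """Create a temporary file of fixed-size records and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_records):
            # "rec" + 12 digits + newline = 16 bytes per record
            f.write(b"rec%012d\n" % i)
    return path


def read_record_lseek(fd, index):
    """Explicit jump: move the descriptor's offset, then read one record."""
    # lseek() only updates the kernel's offset for this descriptor.
    os.lseek(fd, index * RECORD_SIZE, os.SEEK_SET)
    # read() is where the data is actually fetched, from cache or from disk.
    return os.read(fd, RECORD_SIZE)


def read_record_pread(fd, index):
    """Implicit jump: read at a given offset, leaving the descriptor's offset alone."""
    return os.pread(fd, RECORD_SIZE, index * RECORD_SIZE)


if __name__ == "__main__":
    path = make_demo_file(8)
    fd = os.open(path, os.O_RDONLY)
    try:
        print(read_record_lseek(fd, 5))
        print(read_record_pread(fd, 2))
    finally:
        os.close(fd)
        os.remove(path)
```

Either way, an index only tells you which offset to pass in. On an uncached file the read at that offset still has to go to the disk, which is the cost described above.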