How to best do zillions of little files?

Scott Prive Scott.Prive at storigen.com
Wed Oct 2 10:53:58 EDT 2002


I haven't done a bit of work in this area, but I have read how embedded and floppy-based Linux systems work: to conserve space (and perhaps for other filesystem reasons too), they pack everything into one monolithic filesystem image and mount it, so the individual "files" are really just entries inside that single big file.

Not sure if this approach applies even partially, but there you go. See one of the issues of Embedded Linux Magazine for more info.
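If memory serves, the trick looks roughly like this. A minimal sketch, with the image size, paths, and the choice of ext2 being my own assumptions rather than anything from the magazine:

```shell
set -e

# Create a 16 MB regular file to hold the whole filesystem image
dd if=/dev/zero of=/tmp/pages.img bs=1M count=16 2>/dev/null

# Put an ext2 filesystem inside the regular file
# (-F forces mke2fs to operate on a non-device file, -q keeps it quiet)
mke2fs -q -F /tmp/pages.img

# As root, the image can then be mounted through the loop device:
#   mkdir -p /mnt/pages
#   mount -o loop /tmp/pages.img /mnt/pages
# after which everything written under /mnt/pages lives inside the
# single /tmp/pages.img file.
```

The floppy distributions typically compress the image as well, but the basic idea is the same.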

> -----Original Message-----
> From: John Chambers [mailto:jc at trillian.mit.edu]
> Sent: Wednesday, October 02, 2002 10:40 AM
> To: discuss at blu.ORG; discuss at blu.ORG
> Subject: How to best do zillions of little files?
> 
> 
> I have a job that entails on the order of 50 million web pages  (they
> say  this  week  ;-),  a few Kbytes each.  Now, unix file systems are
> generally known to not work that well when you have millions of files
> in  a  single  directory, and the general approach of splitting it up
> into a tree is well known.  But I haven't seen any  good  info  about
> linux  file systems, and the obvious google keywords seem to get lots
> of interesting but irrelevant stuff.
> 
> Anyone know of some good info on this topic for various file systems?
> Is there a generally-useful directory scheme that makes it work well
> (or at least not too poorly) on all linux file systems?
> 
> There's also the possibility of trying the DB systems, but it'd be  a
> bit  disappointing  to spend months doing this and find that the best
> case is an order of magnitude slower than the  dumb  nested-directory
> approach.   (I've  seen this already so many times that I consider it
> the most likely outcome of storing files as records in a DB.  ;-)
> 
> _______________________________________________
> Discuss mailing list
> Discuss at blu.org
> http://www.blu.org/mailman/listinfo/discuss
> 
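For what it's worth, the nested-directory split John mentions is usually done by hashing the file name and using hash prefixes as the directory names. A hypothetical bash sketch (the URL, the md5 choice, and the two-level fan-out are my assumptions, not anything John specified):

```shell
set -e

# Bucket a page, named by the md5 of its URL, into a two-level tree.
url="http://example.com/page1.html"
hash=$(printf '%s' "$url" | md5sum | cut -c1-32)

# Two levels of 2-hex-digit directories gives 256*256 = 65536 buckets.
dir="pages/${hash:0:2}/${hash:2:2}"
mkdir -p "$dir"

# Store the page under its hash inside its bucket.
echo "<html>...</html>" > "$dir/$hash.html"
```

With 65536 buckets, 50 million pages works out to roughly 760 files per directory, which any of the Linux filesystems should handle comfortably; add a third level if the collection grows.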


