[Discuss] File Systems vs. Databases Was: Re: SSH options (was Future of X11 (was Trying to connect to internet in Debian))

Kent Borg kentborg at borg.org
Fri Jan 23 09:23:32 EST 2026


Not that anyone cares or doesn't know, but…

File systems seem simple (store blobs of arbitrary, opaque 
data--files--that have arbitrary text strings of limited length for 
names, and are usually organized in a hierarchy, plus a few other 
features), but they are complicated, must be reliable, and are hard to 
get right.

Databases care very much about the data they store, care deeply 
about the "naming" of the data, usually offer complicated ways of 
organizing data, offer a more complicated set of features than file 
systems do, must be reliable, and are hard to get right.

They are different. A common case is to use them together. Store the 
metadata in a database (title, genre, date, running time, director, 
actors, screenwriter, MPEG path). And index most of that metadata to 
make it easy to search for things. But the fundamental data itself, 
the stuff we really care about (in this example MPEG data), is 
probably not going to be stored in the database, but will be opaque 
blobs--files--hashed into a directory path, stored in a file system.
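
A rough sketch of that split, in Python, might look like the following. 
Everything named here is made up for illustration (the movies table, its 
columns, store_blob, add_movie, find_by_director), and SQLite and 
SHA-256 are just convenient stand-ins for "a database" and "a hash"; a 
real system would also have to think about atomic writes, deletion, and 
backups.

```python
import hashlib
import sqlite3
from pathlib import Path


def store_blob(root: Path, data: bytes) -> Path:
    """Write opaque data under root at a path derived from its SHA-256 hash."""
    digest = hashlib.sha256(data).hexdigest()
    # Two levels of fan-out keep any one directory from growing huge.
    path = root / digest[:2] / digest[2:4] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # identical content hashes to the same path
        path.write_bytes(data)
    return path


def open_catalog(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) metadata table and an index for searching."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS movies ("
        "id INTEGER PRIMARY KEY, title TEXT, director TEXT, "
        "year INTEGER, blob_path TEXT)"
    )
    db.execute(
        "CREATE INDEX IF NOT EXISTS movies_by_director ON movies (director)"
    )
    return db


def add_movie(db, root: Path, title: str, director: str, year: int,
              data: bytes) -> Path:
    """Put the blob in the file system and only its metadata in the database."""
    path = store_blob(root, data)
    db.execute(
        "INSERT INTO movies (title, director, year, blob_path) "
        "VALUES (?, ?, ?, ?)",
        (title, director, year, str(path.relative_to(root))),
    )
    db.commit()
    return path


def find_by_director(db, director: str):
    """Search the metadata; the caller opens blob_path to get the data."""
    return db.execute(
        "SELECT title, blob_path FROM movies WHERE director = ? "
        "ORDER BY title",
        (director,),
    ).fetchall()
```

The database never sees the movie bytes, only a relative path to them, 
and the file system never learns what a "director" is.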

Use the database for the stuff it is good at, use the file system for 
the stuff it is good at. Appreciate the difference.

Even if one gets into new "AI" stuff where the database might know much 
more about the data and can search on lots of fuzzy internal details, 
this extra knowledge is still just a kind of indexing, and the 
fundamental data will still be stored out in files.

In the case of e-mail there is naturally a bunch of structured metadata, 
and it is well suited to being stored in a database. There might be 
interesting indexing on the contents of e-mail, but the bodies of the 
messages (which might range from several bytes long to megabytes long 
and might be any kind of text-represented data) are really well suited 
to live in files. Maybe the e-mail system wants to digest various 
standards for attachments, but those are even better suited to being 
stored in files.
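
For what it is worth, here is a minimal Python sketch of that e-mail 
case, assuming SQLite for the structured metadata and plain files for 
bodies and attachments. The table names, columns, and the helpers 
open_mailstore, save_bytes, and ingest are all hypothetical; the parsing 
uses the standard library's email package.

```python
import hashlib
import sqlite3
from email import message_from_bytes, policy
from pathlib import Path


def open_mailstore(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create hypothetical tables for message and attachment metadata."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, "
        "date TEXT, body_path TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS attachments ("
        "message_id INTEGER, filename TEXT, path TEXT)"
    )
    return db


def save_bytes(root: Path, data: bytes) -> str:
    """Store data in a file named by its hash; return the relative path."""
    digest = hashlib.sha256(data).hexdigest()
    path = root / digest[:2] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return str(path.relative_to(root))


def header(msg, name: str):
    """Return a header as a plain str, or None if it is missing."""
    value = msg[name]
    return str(value) if value is not None else None


def ingest(db, root: Path, raw: bytes) -> int:
    """Headers go to the database; body and attachments go to files."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    body_path = save_bytes(root, text.encode("utf-8"))
    cur = db.execute(
        "INSERT INTO messages (sender, subject, date, body_path) "
        "VALUES (?, ?, ?, ?)",
        (header(msg, "From"), header(msg, "Subject"),
         header(msg, "Date"), body_path),
    )
    message_id = cur.lastrowid
    for part in msg.iter_attachments():
        # decode=True undoes base64/quoted-printable transfer encoding.
        data = part.get_payload(decode=True) or b""
        db.execute(
            "INSERT INTO attachments (message_id, filename, path) "
            "VALUES (?, ?, ?)",
            (message_id, part.get_filename(), save_bytes(root, data)),
        )
    db.commit()
    return message_id
```

Searching by sender or subject is then a database query, while reading a 
message or an attachment is just opening a file.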

Different file systems will have different features, and different 
databases will, too. In some cases the two will blur into each other, 
but they should be thought about differently as one chooses how to use 
one or another in any design.


Don't underestimate how hard a file system is to make. My wife's work 
uses Google web apps but they might switch to MS's competing products 
(horrors). One of the complaints is that Google can't reliably store 
"files": they move around and get lost. Maybe Google stores the files themselves 
as blobs out in a file system, but all the metadata about the file, 
including the simulated "location" that is presented to the user, is 
being stored in a database. And it is hard to get that right. 
(Particularly in this world of continuous integration/continuous 
deployment, which worships feature velocity, does not design before it 
builds, and fired the QA department.)


-kb, the Kent who will shut up now.


