
BLU Discuss list archive



[Discuss] Monitoring the monitor (Re: Versioning File Systems)



On Fri, May 4, 2012 at 5:10 PM, MBR <mbr at arlsoft.com> wrote:
> Thompson, Ritchie, etc. settled on the
> philosophy that, to the greatest extent possible, all Unix I/O would be
> treated as nothing more than a stream of bytes...Versioning filesystems
> run very much counter to the traditional Unix design philosophy.

Shankar Viswanathan <shankar.viswan at gmail.com> responded:
> Your reasoning makes a lot of sense in the context of Unix
> philosophy. Thanks for this explanation.

Interesting points but I'll counter with the approach taken by Apollo
Domain/OS.  On that system, a file's content was a stream of bytes like any
other Unix variant, but the O/S also supported object structures which were
tied to the inode and file type.

Without native support in the O/S filesystem, apps must impose such structures
themselves.  Not necessarily a huge problem, but we've all suffered from lack
of standardization in such things as address-book entries, XML/JSON objects,
PDFs, etc.  Vendors love to make things proprietary and to change them just
ever so slightly from one software rev to the next to keep their competition
on their toes.  The main argument against baking stuff like this into a kernel
filesystem is flexibility amid rapid innovation.  But some of these concepts
are so fundamental that they don't need to be "innovated" ever again, yet
vendors keep making arbitrary changes.

This concept strikes me as orthogonal to the question of versioning, though. 
My career started out on an old system that had a hidden form of versioning
built into it.  When you opened a file for writing (vs. updating) on TOPS-10,
it created a new version of the file on the disk, and marked the existing
version for removal.  That way if there were 1 or more other processes with
the file already open, they could continue reading their now-obsoleted version
without interference from the new process.  Any given version of the file
would be returned to the free block list only after two things happened (1)
the file was marked for removal and (2) the file was closed by all running
processes.
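That reclamation rule is easy to state precisely.  Here's a small Python sketch
(hypothetical names; TOPS-10 did this inside the monitor, not in user code) of
open-for-write creating a new version while readers keep the old one alive:

```python
# Sketch of TOPS-10-style implicit versioning: opening for write creates
# a new version; the old one is marked for removal but its blocks are
# freed only after (1) it's marked AND (2) every open handle is closed.

class Version:
    def __init__(self, data):
        self.data = data                  # stands in for the data blocks
        self.readers = 0                  # open handles on this version
        self.marked_for_removal = False

    def reclaimable(self):
        # Both conditions from the text must hold before blocks are freed.
        return self.marked_for_removal and self.readers == 0

class VersionedFile:
    def __init__(self):
        self.versions = []                # newest last

    def open_read(self):
        v = self.versions[-1]
        v.readers += 1
        return v

    def close_read(self, v):
        v.readers -= 1

    def open_write(self, data):
        # Sequential write: new version, old one implicitly obsoleted.
        if self.versions:
            self.versions[-1].marked_for_removal = True
        v = Version(data)
        self.versions.append(v)
        return v

    def reclaim(self):
        # Return reclaimable versions' blocks to the free list.
        self.versions = [v for v in self.versions if not v.reclaimable()]

f = VersionedFile()
f.open_write(b"v1")
r = f.open_read()            # a reader holds v1 open
f.open_write(b"v2")          # v1 is now marked for removal...
f.reclaim()
assert r.data == b"v1"       # ...but survives while the reader has it open
f.close_read(r)
f.reclaim()                  # now v1's blocks go back to the free list
assert [v.data for v in f.versions] == [b"v2"]
```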

One of the benefits of that approach was that recovering any version of a
file, fully intact, was a trivial exercise so long as its data blocks hadn't
yet been reassigned to a new file (so as long as the disk was kept less than 90% full,
small files could almost always be recovered).  And it didn't cost any
performance.

The one thing about this approach that might violate the Thompson/Ritchie
philosophy of stream-of-bytes agnosticism is that you need to make a
distinction when you open the file between sequential write vs. block-level
updates.  A database has to use block-level update, which bypasses this
implicit versioning.

A few years later the concept of page-mapping of files (i.e. mapping a file's
contents into virtual memory) became the norm.  That changed the way files are
written.  TOPS-20 introduced this, and opted to go with explicit rather than
implicit versioning.  So a database would write to a specific version rather
than defaulting to the next version.
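The write-vs-update distinction is visible with ordinary POSIX tools, too: a
sequential rewrite replaces the whole stream, while a memory-mapped update
patches bytes in place within an existing version.  A Python sketch (just an
illustration, not how TOPS-20 exposed it):

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")

# Sequential write: the whole content is replaced, the sort of open that
# an implicit-versioning scheme could snapshot.
with open(path, "wb") as f:
    f.write(b"hello world")

# Block-level update via mmap: bytes are patched in place in a specific,
# existing copy of the file, which bypasses implicit versioning.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        m[0:5] = b"HELLO"

with open(path, "rb") as f:
    assert f.read() == b"HELLO world"
```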

I've re-keyed a small fraction of my lost data in the past few days.  Ugh.

Perhaps my larger point is this:  filesystems are not, today, integrated
tightly enough with backup tools, and system monitoring is not standardized. 
Ideally a filesystem should have awareness of the backup process (e.g. a
per-file and per-directory flag indicating access by a backup suite), system
monitoring should be set up by default by the distro installer, and backup
suites should follow those standards.
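Absent kernel support for a backup-awareness flag, a backup suite can
approximate it in user space.  A hypothetical sketch (file names and the
manifest format are mine, purely illustrative): keep a per-file manifest of
last-backup times and flag anything modified since.

```python
import json
import os
import tempfile
import time

def needs_backup(root, manifest_path):
    """Return files under root modified since their last recorded backup."""
    try:
        with open(manifest_path) as fh:
            manifest = json.load(fh)      # {path: last_backup_epoch}
    except FileNotFoundError:
        manifest = {}                     # never backed up: everything is stale
    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            p = os.path.join(dirpath, name)
            if os.path.getmtime(p) > manifest.get(p, 0):
                stale.append(p)
    return stale

root = tempfile.mkdtemp()
manifest = os.path.join(tempfile.mkdtemp(), "manifest.json")

with open(os.path.join(root, "notes.txt"), "w") as fh:
    fh.write("hello")

stale = needs_backup(root, manifest)      # notes.txt has never been backed up
assert stale == [os.path.join(root, "notes.txt")]

# Simulate a backup pass: record the time each stale file was saved.
with open(manifest, "w") as fh:
    json.dump({p: time.time() for p in stale}, fh)

assert needs_backup(root, manifest) == []  # nothing modified since
```

This only monitors the files; the recursive question of who monitors the
monitor still stands.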

It's 2012 and these things still aren't standardized sufficiently to give the
user (be they a newbie or an expert) the proper dope-slap we all occasionally
need when we do an incomplete server installation or reconfiguration.

So, it boils down to this recursive question:  how does the monitoring system
get monitored?

-rich





BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org