
BLU Discuss list archive



Re: ReiserFS vs XFS or JFS?



In general, I'm not too fond of "journaling" as a data integrity tool.
Unless you do sophisticated things like block archive on write and full
versioning of files, journaling is *only* for file system stability; it
will not protect any data.

Upon a power failure, application data in the system file cache is no
better protected with a journaling system than without. The only advantage
is file system reconstruction time, but the data will still be lost. There
is no absolute guarantee that a file that survives with bogus data (on a
journaling file system) is any better than a bogus file being deleted by
fsck. The advantage of file system journaling is system restart time and
recoverability.
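
To illustrate the point about the file cache: a write that has only reached
the cache can be lost on power failure whether or not the file system
journals, and the application has to ask for the flush itself. A minimal
Python sketch, assuming a POSIX-like system; the helper name durable_write
and the file name record.dat are made up for illustration.

```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload to path and ask the OS to flush it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Until fsync returns, the bytes may exist only in the file cache.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


# Hypothetical usage: the path and contents are illustrative only.
demo_path = os.path.join(tempfile.mkdtemp(), "record.dat")
demo_bytes = durable_write(demo_path, b"committed record\n")
```

Without the os.fsync call the function would return just as happily, which
is the trap: the file system journal says nothing about whether these bytes
ever left the cache.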


> Interesting.  I hadn't thought of it that way. 
> 
> I understand the concept of journaling a filesystem or database, but I 
> haven't looked at the source code to any of the implementations.  This 
> conversation has got me wondering - where is the journaled data 
> typically stored?  It's clear that the more separate you can keep the 
> journal from the thing being journaled, the safer you are in the case of 
> catastrophic failure.  Do any of these journaled filesystems or 
> journaled databases allow you to write the journal to a separate disk 
> from the one that contains the filesystem or database, on the theory 
> that both disks aren't likely to die simultaneously? 
> 
>     Mark 
> 
> [hidden email] wrote: 
>>> If the raw file containing your database is represented inside the 
>>> structures used by a filesystem and something in the filesystem gets 
>>> trashed, having a journaled database isn't going to help at all because 
>>> the journaling information will be inaccessible, since it's also stored 
>>> in the trashed filesystem. 
>>> 
>> 
>> Depending on the database that may be true, but for a block-level
>> journaling database like Oracle or PostgreSQL it is likely not true. I
>> know PostgreSQL better, so I'll explain using it as an example.
>> 
>> Assume an active database, one in which there may be reusable space in
>> existing blocks and the PostgreSQL WAL files are fairly constant in size.
>> 
>> Assume that the file system will not be obliterated on a power failure,
>> that it is robust enough (or simple enough) to merely have lost chains. I
>> would use something like DOS FAT as a model: simple to the point of being
>> stupid.
>> 
>> When a database table grows, it grows in fixed-size blocks (like FAT).
>> After each database write, fsync is called. Far more often than not, only
>> the data within a file changes while the disk allocation and file system
>> information remain constant.
>> 
>> The meta information and file system never need to be journaled because
>> the file's "file system" characteristics change relatively infrequently,
>> and when they do, fsync will be called immediately. There will never be
>> any real benefit from the file system journal, and you'll end up doing
>> the same work twice.
>> 
>> 
>> 
>>> So, if you store your database inside a filesystem, double journaling is
>>> unavoidable if you want your data safe.  The preferable alternative is
>>> to avoid the structures involved in a filesystem, and allocate an entire
>>> partition to your database.  Then all you have to worry about is if the
>>> partition table were to get trashed, so make a backup of block 0 of the
>>> disk.
>>> 
>>>     Mark R. 
>>> 
>>> [hidden email] wrote: 
>>> 
>>>> IMHO: 
>>>> EXT2 is great for a database journal in that you won't be double
>>>> journalling. (I often speculate that a very minimal UNIX file system
>>>> designed purely for speed and regularly sized blocks, something like a
>>>> streamlined FAT system, would be awesome for databases.)
>>>> 
>> 
>> 
>> 
> 
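
The quoted argument about fixed-size blocks can be sketched in a few lines:
once a file's blocks are allocated, overwriting one of them in place and
calling fsync changes the data but not the file's size. This is only an
illustration, not PostgreSQL's actual code; BLOCK_SIZE, the helper names,
and the file name table.dat are assumptions made up for the example.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed block size, chosen only for illustration


def preallocate(path, n_blocks):
    """Create a file of n_blocks zero-filled blocks and flush it to disk."""
    with open(path, "wb") as f:
        f.write(b"\0" * (BLOCK_SIZE * n_blocks))
        f.flush()
        os.fsync(f.fileno())


def overwrite_block(path, block_no, data):
    """Overwrite one existing block in place, then fsync."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    with open(path, "r+b") as f:
        f.seek(block_no * BLOCK_SIZE)
        f.write(data)
        f.flush()             # push Python's buffer down to the OS
        os.fsync(f.fileno())  # ask the OS to push it to the disk


# Hypothetical usage: a 4-block "table" file with block 2 rewritten.
table_path = os.path.join(tempfile.mkdtemp(), "table.dat")
preallocate(table_path, 4)
size_before = os.path.getsize(table_path)
overwrite_block(table_path, 2, b"x" * BLOCK_SIZE)
size_after = os.path.getsize(table_path)
```

Here size_before and size_after are equal: only the contents of block 2
changed, which is the common case the quoted message describes.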


BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org