Boston Linux & UNIX was originally founded in 1994 as part of The Boston Computer Society. We meet on the third Wednesday of each month at the Massachusetts Institute of Technology, in Building E51.

BLU Discuss list archive



[Discuss] Distributed file systems



> I think the leaders in this space are glusterfs, and ceph.

I set up my home email server as a pair of LXC instances on top of GlusterFS last year. After almost a year of working more-or-less OK, I ditched it for an old-school design: unison running under cron every 5 minutes.
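For anyone who wants to try the cron approach, one thing to watch is a sync run that takes longer than the 5-minute interval and overlaps the next one. Below is a minimal sketch of a wrapper that skips a run if the previous one still holds a lock. It is not the script from this post; the lock path, the script location and the unison profile name "mailsync" are all hypothetical.

```python
#!/usr/bin/env python3
"""Run a command under an exclusive lock so cron runs never overlap."""
import fcntl
import subprocess
import sys

# Hypothetical lock location; any path writable by the cron user works.
LOCK_PATH = "/tmp/mail-sync.lock"


def run_exclusive(cmd, lock_path=LOCK_PATH):
    """Run cmd while holding an exclusive lock on lock_path.

    Returns the command's exit code, or None if another run already
    holds the lock and this one was skipped.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking: fail at once instead of queueing behind a slow run.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        # The lock is released when lock_file is closed on leaving the block.
        return subprocess.run(cmd).returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    code = run_exclusive(sys.argv[1:])
    sys.exit(0 if code is None else code)
```

A crontab entry along the lines of `*/5 * * * * /usr/local/bin/sync_once.py unison mailsync -batch` would then start a sync every 5 minutes and quietly skip any run that would collide with one still in progress. The `-batch` flag tells unison not to ask questions, which is what you want under cron.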

I found 3 problems trying to make GlusterFS work: it burns nearly 1 second of time per inode on every fopen, it creates tens of thousands of sync-status files in a bushy tree under .glusterfs, and over the course of a year I've had about 3 tough-to-diagnose split-brain situations, one of which went undetected for weeks.
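If you want to check the first two of these on your own setup, here is a rough sketch that times open() on a sample of files and counts the entries under a directory tree. The function names and the sample limit are my own choices for illustration, and the numbers will vary a lot with caching and network conditions, so treat the output as a rough indication only.

```python
import os
import time


def open_latencies(root, limit=100):
    """Time open()+close() for up to `limit` files found under root.

    Returns a list of (seconds, path) tuples. Files that cannot be
    opened are skipped.
    """
    timings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if len(timings) >= limit:
                return timings
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            try:
                with open(path, "rb"):
                    pass
            except OSError:
                continue
            timings.append((time.perf_counter() - start, path))
    return timings


def count_entries(root):
    """Return (file_count, directory_count) for everything below root."""
    files = dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return files, dirs
```

Point open_latencies() at the mounted volume and, for comparison, at a local directory with similar contents. count_entries() is meant for the .glusterfs directory, which sits on the brick's backing filesystem rather than on the client mount.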

Documentation is scant, performance is poor for many routine operations like rsync, and monitoring tools are nonexistent. Its main benefit is relative ease of setup, and if you're a licensed RHEL user, you can get support.

I'm back to square 1 on distributed/clustering solutions. At home I have notes on moosefs, cephfs and others I've tried.

The 800-lb gorilla is weighing in on this, with the ultimate vendor lock-in: AWS is rolling out file-storage solutions that will be tempting for many enterprises, and costly to move off of once large data sets are in place. My employer is merrily going down that road, with petabytes already stored there. Their latest offering provides a mountable volume, but is missing basics like snapshots, quotas, ACLs and monitoring (and probably always will, because those are user-space concepts that AWS punts to the user).

-rich



BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.



