[Discuss] Gluster startup, small-files performance

Wed May 14 13:19:30 EDT 2014

On 05/14/2014 12:52 PM, Richard Pieri wrote:
> F. O. Ozbek wrote:
>> If you lose power to your entire storage cluster, you will lose
>> some data; this is true on almost all filesystems (including moosefs
>> and glusterfs).
>
> If writes are atomic -- that is, the file system driver and underlying
> hardware honor fsync calls -- then you won't lose anything that's been
> completed because all completed writes have been committed to
> non-volatile storage. You will lose incomplete writes but the clients
> will know it (fsync calls time out or return errors) and can retry or
> invoke error handlers.

We don't disagree there.
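
For anyone following along, here is roughly what that looks like from
the client side. This is only a minimal Python sketch, assuming a POSIX
system and a file system that really honors fsync; the function names
durable_write and write_with_retry are made up for illustration and are
not part of moosefs, glusterfs, or any other library.

```python
import os


def durable_write(path, data):
    """Write data to path; return only after fsync has succeeded.

    Assumption: the file system and hardware honor fsync. If they do not,
    a successful return here guarantees nothing.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Block until the data is reported to be on stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the parent directory so the new directory entry is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def write_with_retry(path, data, attempts=3):
    """Rewrite the whole file on failure; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            durable_write(path, data)
            return attempt
        except OSError:
            # An incomplete write shows up as an error, so the client knows.
            if attempt == attempts:
                raise
```

The retry rewrites the file from the start rather than calling fsync
again on the same descriptor, since a failed fsync leaves the state of
the already-written data uncertain.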

>> It is possible to set up a moosefs cluster with redundant metadata
>> servers in separate locations (and chunk servers in separate locations).
>
> A catastrophic failure at the wrong time (is there ever a right time?)
> will leave you with lost or corrupted data no matter how many parallel
> storage clusters you have.

You are wrong here. What I said was that if you have your redundant
metadata servers in separate locations (meaning separate data centers),
and assuming that you don't have a regional power outage (Northeast?),
you will be fine. In our case, everything will be in the same
data center (with UPS and generators) and we will have nightly
offsite backups. But yes, if the entire room loses power,
we will lose the last hour at most
(moosefs dumps the metadata database
to disk once an hour). If we want to eliminate that risk,
we can set up a slave metadata server in a separate data room
away from the main site.

>
>
>> This will save you from power outages as long as you don't lose power
>> in all the locations at the same time. Keep in mind these servers
>> are in racks with UPS units and generator backups.
>
> So are Amazon's EC2 racks. Yet:
> http://www.zdnet.com/amazon-web-services-suffers-outage-takes-down-vine-instagram-flipboard-with-it-7000019842/
> http://venturebeat.com/2012/10/23/amazon-ec2-outage-restored/
> http://money.cnn.com/2011/04/21/technology/amazon_server_outage/
>

I think the discussion is losing its focus. If you are a bank
moving billions of dollars around, go make EMC even richer.
If you need a reliable HPC storage cluster that you can actually afford,
use moosefs!