[Discuss] Thin Provisioned LVM

markw at mohawksoft.com markw at mohawksoft.com
Thu Mar 12 12:14:09 EDT 2015


> On 3/12/2015 8:46 AM, markw at mohawksoft.com wrote:
>> (1) If someone could point me in the direction of documentation on how to
>> get ZFS to update file or zvol blocks IN PLACE, i.e. without going through
>> the ZIL, then cool, I would really find that helpful.
>
> See, this is what Ned is on about. There are two things that you've
> written here that demonstrate a significant lack of understanding of ZFS.

NO, I understand this, really I do.
>
> First is the ZIL. ZFS always has a ZIL. On a simple system the ZIL is on
> the data vdevs. In a high performance pool the ZIL is a dedicated
> low-latency device like a RAM-based SSD (optimally a mirrored pair). But
> regardless, there's always a ZIL.

Exactly my point, by the way. I don't want the ZIL for some applications. It
isn't a misunderstanding; I've gone over the code intently, looking for some
way to provide this functionality.

>
> Second is that you don't tell ZFS to update in place. That's not how one
> does things with ZFS.

Yes, I know this. Disagreeing with the way ZFS implements storage is not
the same as misunderstanding it.

> The ZFS way is to enable deduplication and
> compression. I *DID* point you at these and I explicitly called out
> deduplication as the solution to the rampant space gobbling problem that
> you described. You chose to brush all of it off as "ZFS is stupid".
>
> No, it isn't.

I think you misunderstood what I was saying about space utilization.
Consider this: You are a large cloud hosting company. You have a SAN
storage system from which you allocate thin-provisioned virtual LUNs, which
you then present to ESX Server virtual machines. You give each customer a
2T LUN on which to install their OS of choice. The customers are billed by
the actual amount of storage they use. With a filesystem that allocates disk
space conservatively and modifies blocks in place, the hosted system's
footprint on the LUN doesn't grow.

This is good for two things: (1) It saves the customer money because they
are not paying for storage they are not using. (2) It allows the hosting
company to monitor and budget hardware infrastructure additions gradually.

The problem with ZFS is that it is very aggressive about growing the pool.
It assumes there is no cost to using the whole disk. Once it writes to a
block, that block is pulled out of the SAN and allocated to the LUN, and you
can't give it back to the SAN. The amount of "used" data on the LUN hasn't
really changed; more free space has simply been allocated to it. Now the
customer has to pay for that, and the hosting company has to add more
storage to their SAN.
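A toy model of that behavior (my own illustration, not ZFS code, with a
deliberately over-simplified allocator) shows why copy-on-write updates keep
pulling fresh blocks out of a thin LUN while in-place updates do not:

    # Toy model: a thin LUN only backs blocks that have ever been written.
    # In-place updates reuse already-backed blocks; a copy-on-write filesystem
    # writes each update to a fresh block, so the set of backed blocks keeps
    # growing even though the guest's "used" data does not.

    class ThinLUN:
        def __init__(self):
            self.backed = set()        # blocks the SAN has had to allocate

        def write(self, block):
            self.backed.add(block)     # once touched, the SAN can't reclaim it
                                       # unless TRIM/UNMAP is passed through end to end

    lun_inplace, lun_cow = ThinLUN(), ThinLUN()
    next_free = 100                    # next never-used block, for the CoW case

    for i in range(100):               # guest repeatedly rewrites blocks 0..9
        target = i % 10
        lun_inplace.write(target)      # in-place: the same 10 blocks forever
        lun_cow.write(next_free)       # CoW: every rewrite lands on a new block
        next_free += 1

    print(len(lun_inplace.backed))     # 10  -> thin LUN stays small
    print(len(lun_cow.backed))         # 100 -> thin LUN keeps growing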

I have found no way to curtail this behavior, and everyone just says "ZFS
wants to own the disks." That's not a solution to the problem.


>
>
>> First, on Linux, currently, ZFS does not cluster across multiple
>> systems,
>> so there's one instance. That means you can't create fully redundant
>> applications on Linux using ZFS.
>
> I don't know where you picked up this idea but it's very wrong. I've
> designed, deployed and managed fully redundant HA systems without
> cluster-aware file systems. Cluster-aware file systems are just one of
> several solutions to the problem of shared storage.

Fully redundant on Linux, i.e. active-active. This is not supported on
Linux as of 3/12/2015. We have an active-passive solution, but that is only
halfway toward what we want to do.





