
BLU Discuss list archive



[Discuss] LVM Re: A really interesting chain of functionality



On Mon, Sep 26, 2011 at 12:15 PM, Rich Braun <richb at pioneer.ci.net> wrote:
> The open-source LVM manager in Linux provides excellent _read_ performance.
> Where it suffers relative to commercial products (NetApp, Isilon, et al) is
> the _write_ performance.
>
> In this thread, a criticism is leveled that it eats up disk space. Well, if
> you were to allocate 2x the storage of your runtime volume, you'd never run
> out of space on a given snapshot. With 2TB drives dropping under $100 these
> days, I hardly see that space is much of a criterion when planning to use LVM
> or not. If you want to create a lot of active snapshots, then this might be a
> consideration.
>
> Each active snapshot drops write performance due to the copy-on-write
> implementation. (I'm not sure why the open-source product persists in this
> requirement, perhaps there are no active developers looking into this
> problem--there are other ways to attack this problem which would provide
> better performance. Future versions of LVM will someday drop the
> copy-on-write implementation.)

If they do, I don't see how they can avoid creating a potential
copy-on-snapshot-delete write storm. Any snapshot implementation is
going to require two different blocks on disk for every block written
while the snapshot exists (i.e. the original contents and the new
contents of each virtual block which has been written). As I
understand it, LVM keeps the original location for the new version of
each block, while the original contents of that location are copied to
newly allocated blocks. Thus you get two writes (and a read) the
first time any block is written. The opposite approach (new contents
are written to newly allocated blocks) only requires a single write.
(I'm ignoring the IO generated by keeping track of snapshot-related
meta-data.)
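
To put rough numbers on that, here's a toy model of the LVM-style
scheme (Python, names made up; real LVM tracks chunks in a COW area,
not dict entries):

# LVM-style copy-on-write: the first write to a block since the
# snapshot was taken costs a read plus two writes; later writes to
# the same block cost one write each.
class CowSnapshot:
    def __init__(self, origin):
        self.origin = origin          # block number -> contents
        self.cow_area = {}            # saved original contents
        self.reads = self.writes = 0

    def write(self, block, data):
        if block not in self.cow_area:
            self.reads += 1                        # read old contents
            self.cow_area[block] = self.origin[block]
            self.writes += 1                       # copy them aside
        self.origin[block] = data
        self.writes += 1                           # write new data in place

    def snapshot_read(self, block):
        # The snapshot sees the saved copy if the block has changed.
        return self.cow_area.get(block, self.origin[block])

vol = {0: 'a', 1: 'b', 2: 'c'}
snap = CowSnapshot(vol)
snap.write(0, 'A')               # first write: 1 read + 2 writes
snap.write(0, 'A2')              # repeat write: 1 write
print(snap.reads, snap.writes)   # -> 1 3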
The problem occurs when you go to delete a snapshot. With LVM, you
just deallocate the storage where the old data was copied to and do
some meta-data cleanup. With the alternative approach, deleting a
snapshot is more complicated. Assuming you want to actually release
the storage where all of the new data was written while the snapshot
existed, you have to copy all of that data back to the corresponding
locations in the originally allocated space (i.e. a read and a write
per block). So either design requires the same total number of IO
operations; it just changes when they occur.

Admittedly, with the new-data-goes-to-a-new-location design you could
mitigate the write storm by doing a slow copy when you delete a
snapshot, but you can't make those IOs go away. Of course, that also
means you don't get to reuse the space allocated to the snapshot
until the slow copy completes. Everything gets reversed if, instead
of deleting a snapshot, you want to revert to it. Plus you get
different sequential read performance for current vs. snapshot
versions of files depending on which basic design is used and whether
the file has been partially or fully rewritten since the snapshot
started.
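
For comparison, the same toy model for the redirect-on-write
alternative (again, purely illustrative) shows the single write up
front and the deferred read-plus-write bill at delete time:

# Redirect-on-write alternative: new data goes to freshly allocated
# blocks (one write), and the copy-back cost is deferred until the
# snapshot is deleted.
class RedirectSnapshot:
    def __init__(self, origin):
        self.origin = origin      # untouched while the snapshot exists
        self.new_area = {}        # block number -> redirected contents
        self.reads = self.writes = 0

    def write(self, block, data):
        self.new_area[block] = data    # just one write
        self.writes += 1

    def current_read(self, block):
        # The current view prefers the redirected copy.
        return self.new_area.get(block, self.origin[block])

    def delete_snapshot(self):
        # The deferred bill: each redirected block is read and copied
        # back to its original location before the new area is freed.
        # (Reverting to the snapshot would instead just discard new_area.)
        for block, data in self.new_area.items():
            self.reads += 1
            self.origin[block] = data
            self.writes += 1
        self.new_area.clear()

vol = {0: 'a', 1: 'b', 2: 'c'}
snap = RedirectSnapshot(vol)
snap.write(0, 'A')               # 1 write while the snapshot exists
snap.delete_snapshot()           # 1 read + 1 write per redirected block
print(snap.reads, snap.writes)   # -> 1 2

Counting just the first write to a block: 1 read + 2 writes under
copy-on-write, versus 1 write now plus 1 read + 1 write at delete
time under redirect-on-write. Same total, just paid at different
times.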

In summary, I don't see a free lunch anywhere. Which design is
preferable really depends on how your applications work and how you
use snapshots. A system which adapts to your usage patterns (or at
least lets you tell it which basic method to use) is probably the
best that can be done. Hopefully, if LVM ever implements the
alternative design, they will make it an option settable by the
storage administrator.

Bill Bogstad


