
BLU Discuss list archive



[Discuss] lvm snapshot cloning



> On Oct 25, 2011, at 11:20 PM, markw at mohawksoft.com wrote:
>>
>> Actually, in LVM 'a' and 'b' are completely independent, each with its
>> own copy of the COW data. So, if 'a' gets nuked, 'b' is fine, and vice
>> versa.
>
> Rather, this is how LVM works *because* of this situation.  If LVM
> supported snapshots of snapshots then it'd be trivially easy to shoot
> yourself in the foot.

Actually, I'm currently working on a system that does snapshots of
snapshots. It's not LVM, obviously, but it quietly resolves an interior
copy being removed or failing. It's a very "enterprise" system.

>
>> If you know the behaviour of your system, you could allocate a large
>> percentage of the existing volume size (even 100%) and mitigate any
>> risk.
>> You would get your snapshot quickly and still have full backing.
>
> So for your hypothetical 1TB disk, let's assume that you actually have 1TB
> of data on it.  You would need two more 1TB disks for each of the two
> snapshots.  This would be unscalable to my 25TB compute server.  I would
> need another 25TB+ to implement your scheme.  This is a case where I can
> agree that yes, it is possible to coerce LVM into doing it but that
> doesn't make it useful.

Well, we all know that a disk's contents don't change 100% quickly, if
ever. It's typically a very small percentage per day, even on active systems.

So the process is to back up diffs by analyzing two snapshots: a start
point and an end point. Just keep recycling the start point.
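
A rough sketch of that cycle with stock LVM commands (the volume group,
volume names, and sizes here are illustrative; the actual diff step is
what the utility handles):

lvcreate -s -n start -L 50G /dev/vg0/data    # baseline snapshot
# ... writes accumulate on /dev/vg0/data ...
lvcreate -s -n end -L 50G /dev/vg0/data      # end-point snapshot
# back up the blocks that differ between 'start' and 'end'
lvremove -f /dev/vg0/start                   # drop the old start point
lvrename vg0 end start                       # 'end' becomes the next start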

>
>
>> In a disaster, well, brute force will save the day.
>
> My systems work for both individual files and total disaster.  I've proven
> it.

Yes, backups that maintain data integrity work. That's sort of the job.
The issue is reducing the amount of data that needs to be moved each time.

With a block-level backup, you move only the changed blocks. With a
file-level backup, you move whole files. If the files are small, a file-level
backup makes sense. If the files are large, like VM images or databases, a
block-level backup makes sense.
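
To make "only the blocks" concrete, here is a naive illustration that
compares two snapshots chunk by chunk and keeps just the chunks that
differ (device and output paths are made up; a real tool would read the
snapshot COW metadata rather than scan the whole device):

#!/bin/sh
# Naive block-level diff between two snapshots of the same origin.
SRC_OLD=/dev/vg0/start
SRC_NEW=/dev/vg0/end
OUT=/backup/diff
CHUNK=$((4 * 1024 * 1024))                  # 4 MiB chunks
SIZE=$(blockdev --getsize64 "$SRC_NEW")

mkdir -p "$OUT"
i=0
while [ $((i * CHUNK)) -lt "$SIZE" ]; do
    a=$(dd if="$SRC_OLD" bs="$CHUNK" skip="$i" count=1 2>/dev/null | md5sum)
    b=$(dd if="$SRC_NEW" bs="$CHUNK" skip="$i" count=1 2>/dev/null | md5sum)
    if [ "$a" != "$b" ]; then
        # only changed chunks get copied out
        dd if="$SRC_NEW" of="$OUT/chunk.$i" bs="$CHUNK" skip="$i" count=1 2>/dev/null
    fi
    i=$((i + 1))
done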


>
>>> don't need to do any of this and users can go back in time just by
>>> looking
>>> in .clone in their home directories.  I still have nightly backups to
>>> tape
>>> for long-term archives.
>>
>> Seems complicated.
>
> It isn't.  It's a single AFS command to do the nightly snapshot and a
> second to run the nightly backup against that snapshot.
>
>
>
>> totally wrong!!!
>>
>> lvcreate -s -n disaster -L1024G /dev/vg0/phddata
>> (my utility)
>> lvclonesnapshot /dev/mapper/vg0-phdprev-cow /dev/vg0/disaster
>>
>> This will apply historical changes to the /dev/vg0/disaster, the volume
>> may then be used to restore data.
>
> Wait... wait... so you're saying that in order to restore some files I
> need to recreate the "disaster" volume, restore to it, and then I can copy
> files back over to the real volume?

I can't reconstruct the whole example from this snippet, but I think I was
saying that I could clone a snapshot, apply historical blocks to it, and
then pull a specific version of a file from it. Yes.

If you are backing up many small files, rsync works well. If you are
backing up VMs, databases, or iSCSI targets, a block-level strategy works
better.
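
For instance (paths and volume names are illustrative):

# file level: rsync moves whole changed files, fine when files are small
rsync -a --delete /home/ /backup/home/

# block level: snapshot a big VM image or database LV and ship blocks
lvcreate -s -n vmsnap -L 20G /dev/vg0/vmdisk
dd if=/dev/vg0/vmsnap bs=4M | gzip > /backup/vmdisk.img.gz   # full pass; diffs as sketched above
lvremove -f /dev/vg0/vmsnap
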
>
>
>> You have a similar issue with file system backups. You have to find the
>> last time a particular file was backed up.
>
>> Yes, and it should be MUCH faster!!! I agree.
>
> *Snrk*.  Neither of these are true.  I don't have to "find" anything.  I
> pick a point in time between the first backup and the most recent,
> inclusive, and restore whatever I need.  Everything is there.  I don't
> even need to look for tapes; TSM does all that for me.  400GB/hour
> sustained restore throughput is quite good for a network backup system.

400GB/hour? I doubt that number, but OK. Even at that rate, the hypothetical
1TB volume still takes close to three hours to restore.

>
> Remember, ease of restoration is the most important component of a backup
> system.  Yours, apparently, fails to deliver.

Not really; we have been discussing the underlying technology. We haven't
even gotten to the user-facing side.

The difference is what you plan to do, I guess. I'm not backing up many
small files.

Think of it this way: a 2TB drive costs less than $80 and about $0.25 a month
in power. Those economics open up a number of possibilities.



