Most of what's been said on the MySQL backup thread reflects what I've been
doing at work the past few months. I wanted to respond to the original
writer's point about rsync: why does backing up MySQL take so many more
resources than a simple rsync?
Well, the way rsync works is by walking a hierarchy of files and comparing
their names, sizes, and modification times. You could do that with a database
only if you had the same information available about the objects (databases,
tables, records) contained within it.
So in theory you could write a program that evaluates the tables ("SHOW TABLE
STATUS\G") by Update_time and Data_length, and backs up just those changed
since the last backup. Maybe someone has already done that, but it'd probably
be hard to find, because its usefulness is limited to the sites for which
incremental backup matters (i.e. those with large/busy/complex databases --
and those sites have teams of DBAs on staff to write such tools, which by
their nature end up as one-off proprietary scripts rather than open-source
tools like rsync).
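To make the idea concrete, here's a minimal sketch of what such a tool's core logic might look like. Everything here (the function names, the fake status rows, the dump options) is my own illustration, not code from any real tool; note also that for InnoDB tables older MySQL versions report Update_time as NULL, which is exactly why this approach is awkward in practice:

```python
from datetime import datetime

def tables_to_back_up(table_status, last_backup):
    """Pick tables whose Update_time is newer than the last backup.

    table_status: rows mimicking SHOW TABLE STATUS output (only the
    Name and Update_time fields are used here).
    """
    changed = []
    for row in table_status:
        updated = row.get("Update_time")
        # InnoDB often reports NULL for Update_time, so those tables
        # can't be selected this way and would need another strategy.
        if updated is not None and updated > last_backup:
            changed.append(row["Name"])
    return changed

def dump_command(database, tables):
    """Compose a mysqldump invocation for just the changed tables."""
    return ["mysqldump", "--quick", "--add-drop-table", database] + tables

# Fabricated example data:
status = [
    {"Name": "orders",   "Update_time": datetime(2008, 6, 2, 9, 30)},
    {"Name": "archive",  "Update_time": datetime(2008, 1, 15, 0, 0)},
    {"Name": "sessions", "Update_time": None},
]
changed = tables_to_back_up(status, datetime(2008, 6, 1))
print(changed)                        # ['orders']
print(dump_command("shop", changed))
```

In real use you'd feed this from the output of SHOW TABLE STATUS and record the backup timestamp somewhere durable between runs.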
What I do is the following:
* For every production server, I have one or more slave servers replicating in
real-time.
* Backups are taken daily (or more often) using mysqldump from the slave
servers. (Command-line options vary; simplest is --all-databases, some have
--skip-opt, --quick, --add-drop-table, etc.) (Why the slave? To minimize
performance impact on the master.)
* Snapshots are (about to be, not yet in production because a predecessor was
uncomfortable with LVM) taken by the LVM methods described earlier in this
thread. They have to be taken from the master, not a replicated slave.
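A rough sketch of the routine above, expressed as the commands it would run. The host names, volume names, mount points, and dump options are illustrative assumptions of mine, not details from this setup; a real LVM snapshot of a live master also needs the consistency steps (e.g. FLUSH TABLES WITH READ LOCK around lvcreate) described earlier in the thread:

```python
def slave_dump_command(outfile):
    # Dump everything from the replicating slave. The exact option set
    # varies per server, as noted above; this is one plausible choice.
    return ("mysqldump --all-databases --quick --add-drop-table > %s"
            % outfile)

def lvm_snapshot_commands(vg="vg0", lv="mysql", size="5G"):
    # Snapshot the master's data volume, mount it, copy the files off,
    # then tear the snapshot down. Names and sizes are placeholders.
    snap = "%s-snap" % lv
    return [
        "lvcreate --snapshot --size %s --name %s /dev/%s/%s"
            % (size, snap, vg, lv),
        "mount /dev/%s/%s /mnt/snap" % (vg, snap),
        "cp -a /mnt/snap /backup/mysql-snapshot",
        "umount /mnt/snap",
        "lvremove -f /dev/%s/%s" % (vg, snap),
    ]

print(slave_dump_command("/backup/dump.sql"))
for cmd in lvm_snapshot_commands():
    print(cmd)
```

In practice both pieces would be wrapped in cron jobs with dated output paths and some error checking.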
All data in each database is included in each backup, so the amount of disk
required for doing this is at least 10x the actual raw data in the database:
* InnoDB tables on master and slave, with ample spare capacity
* Extra capacity for LVM snapshots
* Space for at least a couple copies of the dumps
* Storage in /srv/log/mysql for the bin logs on the master server, sufficient
to keep transaction data for a few days in case replication breaks
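To put numbers on that 10x figure, here's a back-of-the-envelope tally over the items above. The per-component multipliers are my own illustrative guesses, not measurements from this installation:

```python
def disk_estimate(raw_gb):
    # Rough multiples of the raw data size per component; each value
    # here is an assumed example figure, not a measured one.
    components = {
        "InnoDB on master (with spare capacity)": 2.0,
        "InnoDB on slave (with spare capacity)":  2.0,
        "LVM snapshot headroom":                  1.0,
        "A couple of copies of the dumps":        2.0,
        "Binary logs for a few days":             3.0,
    }
    total = sum(components.values()) * raw_gb
    return components, total

components, total = disk_estimate(10)  # 10 GB of raw data
print(total)  # 100.0 -- i.e. 10x the raw data
```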
Yes, MySQL is a pig. When it oinks, feed it ample portions and it'll hum
along like a dream. Deny it any resource, and you'll be struggling for long
hours to fix breakage. (Yesterday was one such day for me...)
Fortunately, disk space is quite cheap these days!
A note about mysqldump vs. snapshots: mysqldump is more portable across MySQL
versions, so that's the preferred method if you're only using one backup
method. Snapshots provide you, in certain cases, with a faster recovery
method. (Not always, though: sometimes your InnoDB tables can get so huge
that moving the files around takes longer than a mysqldump restore.)
Two other hints for those dealing with large MySQL installations: run Cacti
(a monitoring tool a la MRTG, but far better), and include the IO wait
parameter in the CPU usage chart so you can see, minute by minute, how hard
the disks are getting hammered.
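Cacti graphs this for you over time; as a quick command-line sanity check, the same IO wait figure can be derived from two samples of the "cpu" line in /proc/stat. This is just a sketch of the arithmetic, using fabricated sample lines, not part of Cacti itself:

```python
def iowait_percent(sample1, sample2):
    """Percent of CPU time spent in IO wait between two /proc/stat
    'cpu' lines (fields: user nice system idle iowait irq softirq)."""
    def fields(line):
        return [int(x) for x in line.split()[1:]]
    deltas = [b - a for a, b in zip(fields(sample1), fields(sample2))]
    return 100.0 * deltas[4] / sum(deltas)  # index 4 is iowait

# Two fabricated samples taken some interval apart:
s1 = "cpu  1000 0 500 8000 400 10 10"
s2 = "cpu  1010 0 505 8050 430 10 10"
print(round(iowait_percent(s1, s2), 1))  # 31.6 -- disks are busy
```

A sustained high percentage here is the "disks getting hammered" signal the Cacti chart makes visible at a glance.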
For databases less than a few gigabytes, just do both methods (snapshots and
mysqldump), with replication across two servers, and be done with it. It's
relatively straightforward to set up and doesn't consume more than a hundred
bucks' worth of disks.
-rich