Boston Linux & Unix (BLU) Home | Calendar | Mail Lists | List Archives | Desktop SIG | Hardware Hacking SIG
Wiki | Flickr | PicasaWeb | Video | Maps & Directions | Installfests | Keysignings
Linux Cafe | Meeting Notes | Blog | Linux Links | Bling | About BLU

BLU Discuss list archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: mysql backup quesiton

 Most of what's been said on the MySQL backup thread reflects what I've been 
doing at work the past few months.  I wanted to respond to the original 
writer's point about rsync:  why does MySQL backup take so much more in the 
way of resources than a simply rsync? 

Well the way an rsync works is by evaluating a hierarchy of files, with their 
names, sizes and creation dates.  You could do that with a database only if 
you had the same information available about the objects (databases, tables, 
records) contained within it. 

So you could in theory write a program which evaluates the tables ("SHOW TABLE 
STATUS\G") by Update_time and by Data_length, and backs up just those since 
the last backup.  Maybe someone has already done that but it'd probably be 
hard to find, because its usefulness would be limited to particular sites for 
which an incremental backup is important (i.e. those with large/busy/complex 
databases -- but they have teams of DBAs on staff to write such tools, which 
would by their nature be one-off proprietary things rather than open-source 
tools like rsync). 

What I do is the following: 

* For every production server, I have one or more slave servers replicating in 

* Backups are taken daily (or more often) using mysqldump from the slave 
servers.  (Command line options vary; simplest is --all-databases, some have 
--skipt-opt, --quick, --add-drop-table, etc).  (Why the slave--to minimize 
performance impact on the master.) 

* Snapshots are (about to be, not yet in production because a predecessor was 
uncomfortable with LVM) taken by the LVM methods described earlier in this 
thread.  They have to be taken from the master, not a replicated slave. 

All data in each database is included in each backup, so the amount of disk 
required for doing this is at least 10x the actual raw data in the database: 

* InnoDB tables on master and slave, with ample spare capacity 
* Extra capacity for LVM snapshots 
* Space for at least a couple copies of the dumps 
* Storage in /var/log/mysql for the bin logs on the master server, sufficient 
to keep transaction data for a few days in case replication breaks 

Yes, MySQL is a pig.  When it oinks, feed it ample portions and it'll hum 
along like a dream.  Deny it any resource, and you'll be struggling for long 
hours to fix breakage.  (Yesterday was one such day for me...) 

Fortunately, disk space is quite cheap these days! 

A note about mysqldump vs. snapshots:  mysqldump is more portable across MySQL 
versions, so that's the preferred method if you're only using one backup 
method.  Snapshots provide you, in certain cases, with a faster recovery 
method.  (Not always, though:  sometimes your InnoDB tables can get so huge 
that moving the files around takes longer than a mysqldump restore.) 

Two other hints for those dealing with large MySQL installations:  run Cacti 
(monitoring tool a la MRTG but far better), and include in the CPU usage chart 
the IO wait parameter so you can see how hard the disks are getting hammered 
minute by minute. 

For databases less than a few gigabytes, just do both methods (snapshots and 
mysqldump), with replication across two servers, and be done with it.  It's 
relatively straightforward to set up and doesn't consume more than a hundred 
bucks' worth of disks. 


This message has been scanned for viruses and 
dangerous content by MailScanner, and is 
believed to be clean. 

Discuss mailing list 
[hidden email]

BLU is a member of BostonUserGroups
BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.

Valid HTML 4.01! Valid CSS!

Boston Linux & Unix /