Most of what's been said on the MySQL backup thread reflects what I've been doing at work for the past few months.

I wanted to respond to the original writer's point about rsync: why does a MySQL backup take so much more in the way of resources than a simple rsync? Well, rsync works by evaluating a hierarchy of files, with their names, sizes and modification times. You could do the same with a database only if you had equivalent information about the objects (databases, tables, records) contained within it. So in theory you could write a program that evaluates the tables ("SHOW TABLE STATUS\G") by Update_time and Data_length and backs up only those changed since the last backup. (A rough sketch of such a script is appended at the end of this message.) Maybe someone has already written one, but it would probably be hard to find, because its usefulness is limited to sites where incremental backups matter, i.e. those with large/busy/complex databases; those sites tend to have teams of DBAs on staff who write such tools as one-off proprietary things rather than as general-purpose open-source tools like rsync.

What I do is the following:

* For every production server, I have one or more slave servers replicating in real time.
* Backups are taken daily (or more often) using mysqldump from the slave servers. (Command-line options vary; the simplest is --all-databases, and some add --skip-opt, --quick, --add-drop-table, etc. A sample command is appended below.) Why the slave? To minimize the performance impact on the master.
* Snapshots are about to be taken (not yet in production, because a predecessor was uncomfortable with LVM) using the LVM methods described earlier in this thread; a sketch of that sequence is below as well. They have to be taken from the master, not a replicated slave.

All data in each database is included in each backup, so the amount of disk required is at least 10x the actual raw data in the database:

* InnoDB tables on master and slave, with ample spare capacity
* Extra capacity for LVM snapshots
* Space for at least a couple of copies of the dumps
* Storage in /var/log/mysql for the bin logs on the master, sufficient to keep a few days of transaction data in case replication breaks (this can be capped in my.cnf; see the snippet below)

Yes, MySQL is a pig. When it oinks, feed it ample portions and it'll hum along like a dream. Deny it any resource, and you'll be struggling for long hours to fix breakage. (Yesterday was one such day for me...) Fortunately, disk space is quite cheap these days!

A note about mysqldump vs. snapshots: mysqldump is more portable across MySQL versions, so it's the preferred method if you're only using one. Snapshots can give you a faster recovery path, though not always: sometimes your InnoDB tables get so huge that moving the files around takes longer than a mysqldump restore.

Two other hints for those dealing with large MySQL installations: run Cacti (a monitoring tool a la MRTG, but far better), and include the I/O-wait figure in the CPU usage chart so you can see, minute by minute, how hard the disks are getting hammered.

For databases smaller than a few gigabytes, just do both methods (snapshots and mysqldump), with replication across two servers, and be done with it. It's relatively straightforward to set up and doesn't consume more than a hundred bucks' worth of disks.

-rich
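For what it's worth, here's a minimal sketch of the rsync-style incremental idea above. The backup directory and state file are my own invention, it assumes MySQL 5.0+ (for information_schema) and credentials in ~/.my.cnf, and note that Update_time is only maintained for MyISAM tables, so it wouldn't help much with an InnoDB setup like mine:

    #!/bin/sh
    # Dump only tables whose Update_time is newer than the last run.
    BACKUP_DIR=/backup/mysql-incremental      # assumed location
    STAMP_FILE=$BACKUP_DIR/.last-run

    LAST_RUN=$(cat "$STAMP_FILE" 2>/dev/null || echo '1970-01-01 00:00:00')
    NOW=$(date '+%Y-%m-%d %H:%M:%S')

    mysql -N -B -e "
        SELECT table_schema, table_name
        FROM information_schema.tables
        WHERE table_type = 'BASE TABLE'
          AND update_time > '$LAST_RUN'" |
    while read db tbl; do
        mysqldump --quick --add-drop-table "$db" "$tbl" \
            > "$BACKUP_DIR/$db.$tbl.sql"
    done

    echo "$NOW" > "$STAMP_FILE"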
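The daily dump on the slave boils down to something like the following (wrapped in a cron job in practice; the destination path and use of gzip are just illustrative):

    # run on the slave, not the master
    mysqldump --all-databases --quick --add-drop-table \
        | gzip > /backup/mysql/dump-$(date +%Y%m%d).sql.gz

If you want finer control over the output, --skip-opt turns off the bundled --opt defaults (extended-insert, lock-tables, and so on) so you can pick options individually.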
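The LVM snapshot sequence is roughly as below. The volume group and LV names (vg0/mysql), sizes, and mount points are made up, and the SELECT SLEEP() trick for holding the read lock is a simplification; a real script (or a tool like mylvmbackup) holds the lock in one session and releases it as soon as the snapshot exists:

    #!/bin/sh
    # Crash-consistent copy of the master's datadir via an LVM snapshot.
    # FLUSH TABLES WITH READ LOCK quiesces writes while lvcreate runs;
    # the backgrounded SELECT SLEEP(30) just keeps that session (and lock) open.
    # Only lvcreate needs the lock; the tar runs against the frozen snapshot.
    mysql -e "FLUSH TABLES WITH READ LOCK; SELECT SLEEP(30);" &
    sleep 2                                   # let the lock be acquired

    lvcreate --snapshot --size 5G --name mysql-snap /dev/vg0/mysql

    mount /dev/vg0/mysql-snap /mnt/mysql-snap
    tar czf /backup/mysql/snap-$(date +%Y%m%d).tar.gz -C /mnt/mysql-snap .
    umount /mnt/mysql-snap
    lvremove -f /dev/vg0/mysql-snap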
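On the bin-log retention point, the window can be capped in my.cnf on the master rather than pruned by hand; the values here are placeholders:

    # /etc/my.cnf on the master (example values only)
    [mysqld]
    log-bin          = /var/log/mysql/mysql-bin
    expire_logs_days = 7     # keep roughly a week for replication catch-up
    sync_binlog      = 1     # safer against crashes, at some I/O cost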
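And if Cacti isn't graphing it yet, the same I/O-wait figure can be eyeballed from the command line with the sysstat tools:

    iostat -x 5        # watch the %iowait and per-device %util columns
    vmstat 5           # the "wa" column is CPU time spent waiting on I/O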