Tech Note: Database Backup Revamp Redux
This one’s going to get knee deep in geekspeak before you know it, so feel free to skip over and treat it as an open thread — my feelings won’t be hurt.
It’s all about the backups. You don’t know how important they are until some piece of critical hardware fails and destroys your data, and your only backup is 6 months old. Or you don’t even have a backup. If it’s already happened to you, you have no excuse when it happens again.
Our latest system is fully automated, and makes incremental backups of both the MySQL database and the web server that runs our LGF Blog Engine. The backups are transferred from these two servers to the local office machine, through the wonders of rsync over SSH; the web server is backed up every hour, and the MySQL database once a day. With Apple’s Time Machine backing up the local computer, we have archives going back months—a multi-level disaster recovery system, administered remotely through an ultra-secure connection. Oh yeah.
The database server is backed up less frequently because the backup has to be created first on the server. For maximum portability and ease of restoring, I’m using the mysqldump program to dump out SQL text files for each table in our main database. This process is in a shell script that runs as a daily cron job at low traffic times, because it can take a little while (about 5 minutes for the whole DB) and it uses server resources while dumping the tables.
The Unix rsync program, which runs on the local Mac, is just made for jobs like this; it does a smart incremental backup, and only transfers the changed sections of files—and even compresses them during the transfer. The hourly backup of the entire web server takes only a few seconds and hardly any server resources, because there aren’t many files that change frequently (and the ones that do are usually small).
On the local Mac, I decided to avoid using cron for two reasons: 1) Apple has deprecated it, in favor of their launchd utility, and 2) cron jobs run in a very limited shell environment, and don’t inherit the proper SSH variables, making rsync connections more difficult.
So I’m taking Apple’s advice and using two launchd agents instead, to call the rsync commands; one that runs every hour to back up the web server, and one that runs daily for the MySQL database. Launchd agents run in a full Unix environment, and if you have SSH keys set up for remote access you can use them in your agent. (A new feature of Leopard: built-in support for ssh-agent.)
Creating the launchd agent was simple; it’s an XML “plist” file, and you can either create it by hand if you’re manly enough, or use the free utility Lingon and save some time.
References:
launchd in Depth
MacDevCenter.com — Inside SSH, Part 1