For the past 18 months, I’ve been meaning to review the way I go about doing backups. Whilst I’m quite settled on backing up workstations, until about two weeks ago I had a half-configured, non-recoverable solution which was only set up on one out of several servers. That’s not very useful.
As well as the obvious requirement for something automated, I also wanted something which would allow me to reuse much of the backup definitions across multiple servers, whilst also letting me avoid writing bash scripts. I’d come across the backup Ruby gem some time ago and, after some quick testing, it seemed to do the job.
Then, I needed some sort of remote host to back up to. I had heard of Rsync.net through the prgmr mailing list. They provide a reasonably cheap remote filesystem which you can access over ssh (and so, rsync works too). This is great from the machine being backed up, as I don’t need any special tools to upload backups, and the same applies for restoring.
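For instance, restoring needs nothing beyond stock OpenSSH. A rough sketch, where the user, host and paths are all hypothetical (the real details arrive when you sign up):

```sh
# List the stored runs for a model (hypothetical user and host).
ssh 1234@usw-s001.rsync.net ls server_name/base/

# Pull one run back down with plain scp; rsync works just as well.
scp 1234@usw-s001.rsync.net:server_name/base/2013.07.30.00.30.07/base.tar .
```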
The backup gem uses the concept of “models” to define a backup. A model consists of any directories or databases you may wish to back up, along with any processing you may wish to do (compressing, encrypting, etc.) and where to store the result.
My approach to the models is to create different ones for the different “roles” a system performs. By default, this includes a “base” role which archives /var/log (on a multi-user system, I’d probably split /home out into its own model). Then, additional models are defined for more specific roles. So, to back up a Wordpress site, a “site” model would first back up the associated database and then collect the data in /var/www/site/ into the one archive, as sketched below.
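A sketch of what that site model might look like, using the gem’s MySQL database type; every value here (database name, credentials, paths) is a placeholder rather than my actual config:

```ruby
##
# Site: Wordpress site backup model.
# Dumps the database, then archives the site files.
#
# $ backup perform -t site
##
Backup::Model.new(:site, 'Wordpress Site Model') do

  # dump the site's database first
  database MySQL do |db|
    db.name     = "wordpress_db" # placeholder
    db.username = "wordpress"    # placeholder
    db.password = "secret"       # placeholder
    db.host     = "localhost"
  end

  # then collect the site files into the same run
  archive :site_data do |archive|
    archive.use_sudo
    archive.add '/var/www/site/'
  end

  # rely on the defaults set in config.rb (shown later)
  compress_with Gzip
  store_with SCP
  notify_by Mail
end
```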
Each model is set to be compressed using gzip, uploaded to Rsync.net, and to notify me if any errors occur. It also keeps the last 5 versions of the backup.
backup handles organising the subsequent archives on the remote quite well. For each server, I define a directory to collect them in; backup then keeps a directory per model, with each run stored in a subdirectory named after its timestamp. Much like this:
```
server_name/
  base/
    2013.07.29.21.54.01/
      base.tar
    2013.07.30.00.30.07/
      base.tar
```
Finally, a cron job is used to run it daily.
Rsync.net is refreshingly simple (a bit like prgmr is, I suppose). After
signing up, you’ll get sent the details needed to access the relevant box, where you
ssh in, and do the basics (change the password, check your quota usage, move
files, create directories, etc.).
I opted for a geographically distributed plan (it’s slightly more expensive, but it is the primary backup method I’m using) and the lowest plan — the amount of data is tiny as it’s mostly text files.
And that’s essentially it. I paid for a year, so I’ll be reminded to sort out renewing it this time next year.
The backup Ruby gem gives you a command line tool which helps generate the models and run them. You should install it simply as a gem, rather than via Bundler or anything else.
I did all of this under a specific backup user. This user is configured to allow it to sudo without asking for a password, and not much else. The gem expects the backup models and configuration files to exist in ~/Backup/, so a dedicated user seemed the best approach.
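For the sudo side, a drop-in along these lines would do. Restricting it to tar matches what the gem’s use_sudo calls actually run, though this exact rule is an illustration rather than a copy of my real file:

```
# /etc/sudoers.d/backup -- hypothetical sketch; adjust the tar path for your distro.
# Lets the backup user run tar (and only tar) as root without a password.
backup ALL=(root) NOPASSWD: /bin/tar
```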
Installation is just a matter of:

```sh
gem install backup
```
The documentation suggests using the model generator to get started and that’s pretty much what I did:
```sh
backup generate:model -t base --archives --storages='scp' --compressors='gzip' --notifiers='mail'
```
This will give a rather detailed and well commented example. I started with this
and stripped it down to the bare essentials. If you don’t have one already, it will
also create a template
config.rb, which will contain a similar set of examples.
config.rb can be used to set defaults for the models, so I opted to fill this with
as much as possible.
But, some Real World™ examples are much more useful:
```ruby
##
# Base: Basic Linux backup model.
# Archives and compresses: /etc, /var/log, /home, /mail.
# Uploads to rsync.net.
#
# $ backup perform -t base
##
Backup::Model.new(:base, 'Basic Linux Model') do

  # archive
  archive :etc do |archive|
    archive.use_sudo
    archive.add "/etc/"
  end

  archive :logs do |archive|
    archive.use_sudo
    archive.tar_options '--warning=no-file-changed'
    archive.add '/var/log'
  end

  archive :home do |archive|
    archive.use_sudo
    archive.add '/home'
    # don't back up the backup data.
    archive.exclude '/home/backup/Backup/.tmp/'
  end

  archive :mail do |archive|
    archive.use_sudo
    archive.tar_options '--warning=no-file-changed'
    archive.add '/var/mail'
  end

  # compressor
  compress_with Gzip

  # storage
  store_with SCP

  # notifier
  notify_by Mail
end
```
The archive blocks are just an abstraction over tar, so you can pass options through. In this case, I’ve ignored file-changed warnings for areas which are likely to change harmlessly whilst the backup is running.
Both the storage and notifier lines assume the configuration has already been made. If you didn’t have these defaults in config.rb, it wouldn’t work and you’d need to expand each line into a block.
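For example, expanding the storage line into its block form would look something like this (all of the values are placeholders):

```ruby
store_with SCP do |server|
  server.username = "1234"               # placeholder
  server.ip       = "usw-s001.rsync.net" # placeholder
  server.port     = 22
  server.path     = "~/server_name/"
  server.keep     = 5
end
```

With the defaults in place, none of that needs repeating per model. Here’s my config.rb: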
```ruby
##
# SCP Storage Type Defaults
##
Backup::Storage::SCP.defaults do |server|
  server.username = ""
  server.ip       = ""
  server.port     = 22
  server.path     = "~/server_name/"
  server.keep     = 5
end

##
# Notifier Defaults
##
Backup::Notifier::Mail.defaults do |mail|
  mail.from           = ""
  mail.to             = ""
  mail.address        = "smtp.gmail.com"
  mail.port           = 587
  mail.domain         = ""
  mail.user_name      = ""
  mail.password       = ""
  mail.authentication = "plain"
  mail.encryption     = :starttls
end

# * * * * * * * * * * * * * * * * * * * *
# Do Not Edit Below Here.
# All Configuration Should Be Made Above.

##
# Load all models from the models directory.
Dir[File.join(File.dirname(Config.config_file), "models", "*.rb")].each do |model|
  instance_eval(File.read(model))
end
```
The SCP block contains the details for Rsync.net. The mail defaults are currently set to the values for Gmail’s SMTP server (you’ll need to fill in all of the other relevant bits). By default, this will notify about all events (success, warning or failure). I kept it like this for about two weeks to confirm it was running correctly.
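Once it had earned some trust, the notifier could be quietened down. A sketch using the gem’s per-event flags:

```ruby
notify_by Mail do |mail|
  mail.on_success = false # stay quiet when everything worked
  mail.on_warning = true
  mail.on_failure = true
end
```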
A couple of gripes with the backup gem, though:

- The configuration files would be more appropriately stored in /etc, given the gem is designed for Unix-like systems.
- The timestamps are annoying. Unix timestamps or ISO 8601 would be far more appropriate than defining another bloody date format.
To automate it, I just have a
crontab entry under the
backup user. This then
runs at 0130 every morning and emails me if necessary.
```
30 01 * * * backup perform -t base,www >/dev/null
```
(By default, it writes a lot to stdout; I’d rather not fill up an unmonitored inbox with successes…)
I configured this about two weeks before writing up this blog post. It’s been working well since then, and I’ve deployed a similar configuration across at least two other boxes.
I now need to investigate the Chef cookbook and modify it to work similarly to this…