Introduction to BorgBackup

What is BorgBackup?

BorgBackup is a deduplicating backup tool with support for compression and authenticated encryption. The same borg ‘binary’ acts as both client and server. Data transfer is unidirectional: from client to server only. Deduplication, compression, and encryption are all done on the client side.

borg instances communicate via a specific protocol encapsulated in SSH. Unfortunately, no other protocol is supported (neither SFTP nor rsync). As with rsync, the server-side instance can be restricted to a specific directory via an SSH command restriction.
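
The command restriction mentioned above is usually set in the server's ~/.ssh/authorized_keys file. Here is a minimal sketch, assuming borg 1.x (which provides `borg serve --restrict-to-path`) and OpenSSH 7.2 or later (for the `restrict` option). The helper name, the repository directory, and the key are hypothetical placeholders:

```shell
#!/bin/sh
# Print an authorized_keys entry that forces "borg serve" and confines it
# to one directory. The function name and its arguments are illustrative.
restricted_key_line() {
    repo_dir=$1      # directory the client may access (hypothetical path)
    public_key=$2    # the client's public key, as found in its .pub file
    printf 'command="borg serve --restrict-to-path %s",restrict %s\n' \
        "$repo_dir" "$public_key"
}
```

It could be used as `restricted_key_line /srv/borg/client1 "$(cat client1.pub)" >> ~/.ssh/authorized_keys`, once per client key.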

borg saves files by splitting them into ‘chunks’ of data. Chunks are stored inside a ‘repository’, which must be initialized before use. Backups, called ‘archives’ in borg lingo, are specific to a repository.
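
The relationship between repository and archives can be sketched with borg 1.x commands (borg 2 renames some of them). The function names are hypothetical, and the `BORG` variable is introduced here only so that the commands can be previewed with `BORG=echo`:

```shell
#!/bin/sh
# BORG is an assumption of this sketch: set BORG=echo to print the
# commands instead of running them.
BORG=${BORG:-borg}

# Initialize a repository once, before the first backup.
# $1: repository location.
init_repo() {
    "$BORG" init --encryption=repokey "$1"
}

# Create one archive inside an existing repository.
# $1: repository, $2: archive name, remaining arguments: paths to back up.
create_archive() {
    repo=$1; name=$2; shift 2
    "$BORG" create "${repo}::${name}" "$@"
}

# List the archives stored in a repository.
list_archives() {
    "$BORG" list "$1"
}
```

A typical sequence would be `init_repo` once, then `create_archive` on every backup run, each run adding a new archive to the same repository.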

Client’s configuration

Because deduplication, compression, and encryption are all CPU-intensive, it’s better to enable only what you really need. For example, encryption isn’t strictly necessary if you have a VLAN dedicated to backups and your storage servers aren’t accessible from the internet.

The same goes for compression if the storage server uses a filesystem with native compression, such as ZFS. For network transfer, a light compression algorithm such as lz4 (or zlib at level 4, written zlib,4 in borg) is good enough and doesn’t introduce much overhead.
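
As a sketch of these choices, assuming borg 1.x option syntax: encryption is decided once, when the repository is initialized, while compression is chosen on each `borg create` run. The function names and the `BORG` variable are hypothetical helpers, not part of borg:

```shell
#!/bin/sh
# BORG is an assumption of this sketch: set BORG=echo to print the
# commands instead of running them.
BORG=${BORG:-borg}

# Repository without encryption: only reasonable on a trusted, isolated
# backup network such as the dedicated VLAN described above.
init_plain_repo() {
    "$BORG" init --encryption=none "$1"
}

# Archive with an explicitly chosen compression algorithm.
# $1: algorithm ("lz4", "zlib,4" or "none"), $2: repo::archive, rest: paths.
create_compressed() {
    algo=$1; target=$2; shift 2
    "$BORG" create --compression "$algo" "$target" "$@"
}
```

For instance, `create_compressed lz4 "$REPO::monday" /etc` keeps the CPU cost low, and `create_compressed none ...` leaves compression entirely to the storage filesystem.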

Protocol limits

Borg’s communication protocol is quite limited. Unlike the rsync protocol, it has no notion of ‘label’. The full path must be specified in the client-side configuration, so all storage servers must follow the same naming conventions and paths.
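
In practice this means every client carries the full server-side path in its repository URL. Here is a small sketch, assuming the borg 1.x `ssh://user@host/absolute/path` URL form. The helper name, user, host, and path are hypothetical:

```shell
#!/bin/sh
# Build a borg repository URL from its parts. Relative paths are refused
# so that the location is unambiguous on every storage server.
# $1: SSH user, $2: storage server, $3: absolute path of the repository.
repo_url() {
    user=$1; host=$2; path=$3
    case $path in
        /*) printf 'ssh://%s@%s%s\n' "$user" "$host" "$path" ;;
        *)  echo "repository path must be absolute: $path" >&2
            return 1 ;;
    esac
}
```

Keeping the path in one helper like this makes it easier to apply the same convention to every storage server.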

More annoying: there is no native mechanism to limit the number of borg instances. You must rely on a ‘dirty’ solution, such as calling a wrapper like this one from the command restriction:

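#!/bin/bash
# Refuse a new borg session once INSTANCE_LIMIT sessions are already running.
INSTANCE_LIMIT=30
# Number of borg-bin processes currently running on this server.
INSTANCE_NB=$(pgrep borg-bin | wc -l)

if [ "$INSTANCE_NB" -ge "$INSTANCE_LIMIT" ]; then
       # Limit reached: reject this connection with a distinctive exit code.
       exit 5
else
       # "$@" passes every argument through unchanged, even with spaces.
       exec borg-bin "$@"
fi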

Note that, from the server side, you can’t know what the borg client is doing (creating a repository, creating archives, etc.). Any backup solution based on a push model must take this limitation into account.

Backup solution

To my knowledge there is only one backup solution built on borg: BorgCube.

I haven’t tested the product but by the author’s own admission it’s not production ready yet. Note that BorgCube uses the pull model (the server connects to the client via SSH and runs the borg command with the right parameters) in order to solve the ‘concurrency problem’.

Rsnapshot

rsnapshot is a utility for making local and remote backups. It is written in Perl and uses rsync for data transfer and hard links for deduplication. You can find more information about it on the official website.
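
The hard-link deduplication can be illustrated with rsync's `--link-dest` option. This is a simplified sketch of the idea, not rsnapshot's actual implementation. The function name, the directory layout, and the `RSYNC` variable are assumptions made for the example:

```shell
#!/bin/sh
# RSYNC is an assumption of this sketch: set RSYNC=echo to print the
# command instead of running it.
RSYNC=${RSYNC:-rsync}

# Copy source $1 into the new snapshot directory $3. Files unchanged since
# the previous snapshot $2 are hard-linked to it instead of stored again.
# $2 should be an absolute path: rsync interprets a relative --link-dest
# relative to the destination directory.
snapshot() {
    src=$1; previous=$2; new=$3
    "$RSYNC" -a --delete --numeric-ids --link-dest="$previous" "$src" "$new"
}
```

For example, `snapshot /etc/ /backup/daily.1 /backup/daily.0/` would produce a complete-looking daily.0 tree that only consumes disk space for the files that changed since daily.1.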

Restore a machine from a backup

It is very easy under GNU/Linux to restore a system from a full rsync backup. Just follow this procedure:

  • Boot from a live CD
  • If the hard drive was replaced, partition the new one using parted
  • Create a temporary directory and mount the root partition of your HDD on it (/dev/sdb1 in this example):
mkdir /mnt/root_hdd
mount -t ext3 /dev/sdb1 /mnt/root_hdd
  • Restore data from your backup:
rsync -av --numeric-ids --delete --exclude='/proc' --exclude='/sys' /media/<mybackup>/ /mnt/root_hdd/

Data on the root partition will be overwritten, and files that are not in the backup will be deleted.
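
Since `--delete` is destructive, it can be worth previewing the restore first. Here is a sketch using rsync's `-n` (`--dry-run`) flag with otherwise the same options as the restore command above. The function name and the `RSYNC` variable are hypothetical:

```shell
#!/bin/sh
# RSYNC is an assumption of this sketch: set RSYNC=echo to print the
# command instead of running it.
RSYNC=${RSYNC:-rsync}

# Show what the restore would copy and delete, without changing anything.
# $1: backup directory, $2: mounted root partition (both with a trailing
# slash, as in the restore command).
preview_restore() {
    "$RSYNC" -avn --numeric-ids --delete \
        --exclude='/proc' --exclude='/sys' "$1" "$2"
}
```

Lines starting with "deleting" in the output mark the files that the real run would remove from the target.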

If the disk was replaced, the boot loader must be reinstalled as well:

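  • Install grub on the new disk itself (/dev/sdb here, not the partition). Since /proc and /sys were excluded from the restore, create their mount points first:
mkdir -p /mnt/root_hdd/proc /mnt/root_hdd/sys
mount --bind /proc /mnt/root_hdd/proc
mount --bind /dev /mnt/root_hdd/dev
mount --bind /sys /mnt/root_hdd/sys
chroot /mnt/root_hdd
grub-install /dev/sdb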

That’s all, reboot and enjoy.