Introduction to BorgBackup

What is BorgBackup?

BorgBackup is a de-duplication backup tool with support for compression and authenticated encryption. The same ‘binary’ borg acts both as client and server. Data transfer is unidirectional: client to server only. De-duplication/compression/encryption are done on the client side.

borg instances communicate via a specific protocol encapsulated in SSH. Unfortunately there is no support for other protocols (neither SFTP nor rsync). As with rsync, it’s possible to restrict the server-side instance to a specific directory via a command restriction.
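As a rough sketch of such a command restriction: OpenSSH lets you attach a forced command to a key in authorized_keys, and borg serve accepts --restrict-to-path to confine the server-side instance to one directory. The helper name, the directory and the key below are placeholders for illustration, not values from a real setup.

```shell
#!/bin/bash
# Print an authorized_keys entry that forces `borg serve` and confines it
# to a single directory. Both arguments are illustrative placeholders.
authorized_keys_entry() {
    local repo_dir="$1"   # directory the client is allowed to use
    local pubkey="$2"     # the client's public key line
    printf 'command="borg serve --restrict-to-path %s",restrict %s\n' \
        "$repo_dir" "$pubkey"
}

# Example call with made-up values; append the output to the backup
# user's ~/.ssh/authorized_keys on the storage server.
authorized_keys_entry /srv/borg/client1 "ssh-ed25519 AAAA_PLACEHOLDER client1"
```

With such an entry, whatever the client asks for over SSH, the server only ever runs borg serve limited to that directory.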

borg saves files by splitting them into ‘chunks’ of data. Chunks are stored inside a ‘repository’, which must be initialized before use. Backups, called ‘archives’ in borg-lingo, are repository specific.
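The repository/archive relationship can be sketched with borg's init and create subcommands (Borg 1.x syntax). The repository location and the function names are made up for illustration, and the BORG variable is only there so the commands can be previewed (for example with BORG=echo) without touching a real server.

```shell
#!/bin/bash
# Hypothetical repository location; adjust user, host and path.
REPO=${REPO:-backup@storage01:/srv/borg/client1}
# Command to run; set BORG=echo to preview instead of executing.
BORG=${BORG:-borg}

# A repository must be initialized once before any archive can be created.
init_repo() {
    "$BORG" init --encryption=repokey "$REPO"
}

# Each backup run creates one archive inside that repository,
# addressed as REPOSITORY::ARCHIVE.
create_archive() {
    local name="$1"; shift
    "$BORG" create "${REPO}::${name}" "$@"
}
```

A typical sequence would be init_repo once, then something like create_archive "$(hostname)-$(date +%F)" /etc /home on every run. An archive name only makes sense within the repository it was created in.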

Client’s configuration

As de-duplication, compression and encryption are all CPU intensive, it’s better to only use what you really need. For example, encryption isn’t strictly necessary if you have a VLAN dedicated to backups and your storage servers aren’t accessible from the internet.

The same goes for compression if the storage server uses a filesystem with native compression capabilities, like ZFS. For network transfer, a light compression algorithm such as lz4 (or zlib,4) is good enough and doesn’t introduce much overhead.
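To make the trade-off concrete, here is a small sketch that maps a situation to a value for borg create's --compression option (borg spells the zlib level as zlib,4). The profile names are invented for this example; only the option values none, lz4 and zlib,4 come from borg itself.

```shell
#!/bin/bash
# Map a (hypothetical) profile name to a borg --compression value.
compression_for() {
    case "$1" in
        server-compresses) echo none ;;    # e.g. ZFS already compresses on disk
        network)           echo lz4 ;;     # light, low CPU cost
        smaller)           echo zlib,4 ;;  # heavier, usually a better ratio
        *)                 echo lz4 ;;     # reasonable default
    esac
}

# Preview the resulting command line (REPO::ARCHIVE is a placeholder).
echo borg create --compression "$(compression_for network)" 'REPO::ARCHIVE' /etc
```

Encryption is chosen once per repository instead, at initialization time, for example with borg init --encryption=none on a trusted, isolated backup network.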

Protocol limits

Borg’s communication protocol is quite limited. Unlike the rsync protocol, there is no notion of ‘label’: the full path of the repository must be specified in the client-side configuration. All storage servers must therefore follow the same naming conventions and paths.
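One way to live with this is to derive every repository location from a single shared convention. The sketch below assumes a layout that is purely hypothetical: a backup user and one sub-directory per client under the same base directory on every storage server.

```shell
#!/bin/bash
# Hypothetical convention: identical base directory on all storage servers.
BORG_BASE=${BORG_BASE:-/srv/borg}

# Build the full repository URL for a given storage server and client.
repo_url() {
    local server="$1" client="$2"
    printf 'ssh://backup@%s%s/%s\n' "$server" "$BORG_BASE" "$client"
}

repo_url storage01 web01   # prints ssh://backup@storage01/srv/borg/web01
```

Because the client always spells out the full path, moving a client to another storage server only works if that server uses the same base directory.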

More annoying: there isn’t any native mechanism to limit the number of borg instances. You must rely on a ‘dirty’ solution, such as calling a wrapper like this one inside the command restriction:

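#!/bin/bash
# Refuse to start a new borg instance when too many are already running.
INSTANCE_LIMIT=30
# Count the borg-bin processes currently running on this server.
INSTANCE_NB=$(pgrep borg-bin | wc -l)

if [ "$INSTANCE_NB" -ge "$INSTANCE_LIMIT" ]; then
       # Limit reached: reject this connection with a distinctive exit code.
       exit 5
else
       # Pass all arguments through unchanged, preserving their quoting.
       exec borg-bin "$@"
fi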

Note that, on the server side, you can’t know what the borg client is doing (creating a repository, creating an archive, etc.). Any backup solution based on a push model must take this limit into account.

Backup solution

To my knowledge there is only one backup solution built on borg: BorgCube.

I haven’t tested the product but by the author’s own admission it’s not production ready yet. Note that BorgCube uses the pull model (the server connects to the client via SSH and runs the borg command with the right parameters) in order to solve the ‘concurrency problem’.