Just another Sys Admin blog… wait really ?

2017-07-12

Introduction to BorgBackup

What is BorgBackup?

BorgBackup is a deduplicating backup tool with support for compression and authenticated encryption. The same borg binary acts as both client and server. Data transfer is unidirectional: client to server only. Deduplication, compression and encryption are all done on the client side.

Borg instances communicate through a dedicated protocol encapsulated in SSH. Unfortunately there is no support for other protocols (neither SFTP nor rsync). As with rsync, the server-side instance can be restricted to a specific directory through an SSH command restriction (borg serve --restrict-to-path).

Borg saves files by splitting them into ‘chunks’ of data. Chunks are stored inside a ‘repository’, which must be initialized before use. Backups, called ‘archives’ in borg lingo, are specific to a repository.

Client configuration

Deduplication, compression and encryption are all CPU-intensive, so it’s better to enable only what you really need. For example, encryption isn’t strictly necessary if you have a VLAN dedicated to backups and your storage servers aren’t reachable from the internet.

The same goes for compression if the storage server uses a filesystem with native compression, such as ZFS. For network transfer, a light compression algorithm such as lz4 (or zlib at level 4) is good enough and doesn’t introduce much overhead.
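As an illustration, here is a minimal Python sketch that builds the two borg command lines matching the choices above: a repository initialized without encryption and an archive created with lz4 compression. The repository URL, archive name and path are hypothetical, and the sketch only prints the commands instead of running them.

```python
import shlex
from typing import List


def borg_commands(repo: str, archive: str, paths: List[str]) -> List[List[str]]:
    """Build the 'borg init' and 'borg create' command lines (nothing is executed)."""
    # No encryption: only sensible on a dedicated, isolated backup network.
    init = ["borg", "init", "--encryption=none", repo]
    # Light compression for the network transfer.
    create = ["borg", "create", "--compression", "lz4", f"{repo}::{archive}", *paths]
    return [init, create]


# Hypothetical repository, archive name and path, for illustration only.
for cmd in borg_commands("backup@storage.example.org:/srv/borg/web01", "etc-2017-07-12", ["/etc"]):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the commands as argument lists makes them easy to pass to subprocess.run later without shell quoting problems.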

Protocol limits

Borg’s communication protocol is quite limited. Unlike the rsync protocol, it has no notion of ‘label’ (a name that the server maps to a path). The full path must be specified in the client-side configuration, so all storage servers must follow the same naming conventions and paths.

More annoying: there isn’t any native mechanism to limit the number of concurrent borg instances. You must rely on a ‘dirty’ solution, such as calling a wrapper like this one from the command restriction:

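#!/bin/bash
# Wrapper called from the SSH command restriction.
# borg-bin is assumed to be the real borg binary.
INSTANCE_LIMIT=30
# Count the borg-bin processes already running (prints 0 when there are none).
INSTANCE_NB=$(pgrep -c borg-bin)

if [ "$INSTANCE_NB" -ge "$INSTANCE_LIMIT" ]; then
       # Limit reached: refuse this connection with a distinctive exit code.
       exit 5
else
       # Hand over to the real binary, keeping every argument intact.
       exec borg-bin "$@"
fi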

Note that, on the server side, you can’t know what the borg client is doing (creating a repository, creating an archive, etc.). Any backup solution based on a push model must take this limit into account.

Backup solution

To my knowledge there is only one backup solution built on borg: BorgCube.

I haven’t tested the product but by the author’s own admission it’s not production ready yet. Note that BorgCube uses the pull model (the server connects to the client via SSH and runs the borg command with the right parameters) in order to solve the ‘concurrency problem’.

2017-02-19, updated 2017-04-19

[MySQL] Logging Deadlock errors

Since MySQL 5.6.2 it is possible to log deadlock errors without an additional tool such as pt-deadlock-logger.

To activate this function, add inside your my.cnf:

# log deadlock in error log. default = /var/lib/mysql/$(hostname -s).err
innodb_print_all_deadlocks = 1

To enable it without restarting:

# myadm -e "set global innodb_print_all_deadlocks=1;"

Further Reading and sources

https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_print_all_deadlocks

2017-01-14, updated 2017-02-14

[Varnish] Varnish book for version 4.x

To improve your skills in varnish configuration, you can now browse and download the Varnish Book in version 4. Enjoy.

2016-12-02, updated 2017-02-20

[Varnish] 4.x Hit / Miss header

With version 4.x the classic snippet:

sub vcl_deliver {
        if (obj.hits > 0) {
                set resp.http.X-Cache = "HIT";
        } else {
                set resp.http.X-Cache = "MISS";
        }
}

doesn’t work anymore, because obj.hits no longer reports the number of hits for a cached object. To add a custom header indicating where the object came from, you must use a little trick based on the X-Varnish header value.

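For a non-cacheable or freshly inserted object, the X-Varnish header holds a single ID: the transaction ID of the current request. For an object served from the cache it holds two: the ID of the current request followed by the ID of the request that put the object in the cache. This snippet relies on that property: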

sub vcl_deliver {
  if (resp.http.X-Varnish ~ "[0-9]+ +[0-9]+") {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}
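To see what the regular expression actually matches, here is a small Python sketch applying the same pattern to two X-Varnish values. The header values are made up for illustration; only the pattern comes from the VCL above.

```python
import re

# Same pattern as in the VCL test: two numeric IDs separated by spaces.
HIT_PATTERN = re.compile(r"[0-9]+ +[0-9]+")


def cache_status(x_varnish: str) -> str:
    """Return "HIT" when the header holds two IDs, "MISS" otherwise."""
    # re.search is unanchored, like the ~ operator in VCL.
    return "HIT" if HIT_PATTERN.search(x_varnish) else "MISS"


# Made-up header values, for illustration only.
for value in ("32770", "32773 32771"):
    print(value, "->", cache_status(value))
```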

2016-11-09, updated 2017-02-09

[Varnish] 4.x HTTP Authentication

With the 4.x branch, the VCL syntax has changed significantly, rendering this post obsolete.

Now the vcl_recv block must look like this:

if (!req.http.Authorization ~ "Basic XXXXXXXXXXXXX"){
   return(synth(401, "Authentication required"));
}
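The XXXXXXXXXXXXX placeholder stands for the base64 encoding of user:password, as defined by HTTP Basic authentication. A minimal Python sketch to generate the full header value (the credentials are hypothetical):

```python
import base64


def basic_auth_value(user: str, password: str) -> str:
    """Return the Authorization header value for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# Hypothetical credentials, for illustration only.
print(basic_auth_value("user", "secret"))
```

The printed string is what the VCL condition compares the Authorization header against.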

Also, vcl_error doesn’t exist anymore. It has been split into vcl_backend_error, for backend errors as its name implies, and vcl_synth, for all “custom-made” error codes. The block to add to vcl_synth looks like this:

if (resp.status == 401) {
   set resp.status = 401;
   set resp.http.WWW-Authenticate = "Basic";
   return(deliver);
}

As usual you can add your own HTML code inside this function.

2016-10-27, updated 2017-02-20

[Varnish] 4.x health command

With previous Varnish versions, the “undocumented-command-to-know” to check backend health was:

varnishadm debug.health

A return code of 200 was good and any other value was bad. The problem: this command gives basically no details about the probe settings or their success rate.

With Varnish 4.x the command has changed. Now it’s:

varnishadm backend.list

and this time it gives you not only a list of all backends but also the success rate of the associated probe.

2016-07-21, updated 2017-03-21

[Varnish] varnishstat

varnishstat is a tool used to monitor the basic health of Varnish. Unlike others it doesn’t read log entries, but displays statistics from a running varnishd instance. It can be used to determine request rate, memory and thread usage.

Output

Like top, the command refreshes its output in real time. Data is displayed as a table. The first column is the raw value of the counter. For example, for the ‘cache_hit’ counter, this is the total number of cache hits since varnishd was started. The second column is the change per second. The third is the average change per second since varnishd was started. The next three are the same average computed over the last 10, 100 and 1000 seconds respectively.

Note that you can use the -1 option for non-‘interactive’ use. In this case varnishstat lists all stats and quits immediately.
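The -1 output is easy to feed to a script. Below is a minimal Python sketch of a parser, assuming each line starts with the counter name followed by its raw value (the remaining fields are ignored). The sample text is made up for illustration; real counter names and layout may differ between Varnish versions.

```python
from typing import Dict


def parse_varnishstat(text: str) -> Dict[str, int]:
    """Map each counter name to its raw value, skipping lines that do not fit."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect at least "<name> <value>"; anything else is ignored.
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


# Made-up sample in the assumed layout: name, raw value, rate, description.
SAMPLE = """\
MAIN.cache_hit          7000         7.00 Cache hits
MAIN.cache_miss         3000         3.00 Cache misses
"""

print(parse_varnishstat(SAMPLE))
```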

Interesting counters

There are a lot of counters to keep an eye on, but the key metrics can be divided into 4 categories:

Client: client connections and requests
Cache: cache hits, evictions
Thread: thread creation, failures, queues
Backend: success, failure and health

Client metrics

client_req: this counter displays the number of requests you’re receiving per unit of time. Monitoring this metric can alert you to spikes in incoming web traffic, whether legitimate or nefarious.

sess_dropped: once Varnish is out of worker threads, each new request is queued up and the sess_queued counter is incremented. When the queue is full, new incoming requests are dropped and sess_dropped is incremented. If sess_dropped isn’t equal to zero, either your Varnish is overloaded or its thread pool is too small.

Cache performance metrics

Using the cache_hit and cache_miss counters, you can calculate the cache hit ratio:

ratio = cache_hit / (cache_hit + cache_miss)

This derived metric provides visibility into the effectiveness of the cache. The higher the ratio, the better. A ratio above 0.7 is considered ‘good’. If your ratio is ‘bad’, you should check which objects aren’t cached and why.
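The formula translates directly into code. A minimal Python sketch, with made-up counter values; the only addition is a guard for a freshly started varnishd where both counters are still zero:

```python
def cache_hit_ratio(cache_hit: int, cache_miss: int) -> float:
    """Return cache_hit / (cache_hit + cache_miss), or 0.0 when there is no traffic yet."""
    total = cache_hit + cache_miss
    if total == 0:
        # Avoid dividing by zero right after varnishd starts.
        return 0.0
    return cache_hit / total


# Made-up counter values, for illustration only.
ratio = cache_hit_ratio(7000, 3000)
print(f"hit ratio: {ratio:.2f}")
```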

n_lru_nuked: the LRU (Least Recently Used) nuked objects counter should be watched closely. If its value increases a lot, Varnish is probably evicting objects at a faster rate than usual because of a memory shortage. In this case you may want to increase the cache size if possible.

Thread metrics

threads_failed: the number of times varnishd unsuccessfully tried to create a thread. A value greater than zero likely indicates that you have reached the server’s limits. It can also happen if you try to spawn a huge number of threads in a short time. The latter case usually occurs right after Varnish is started, and can be corrected by increasing the thread_pool_add_delay value.

threads_limited: the number of times a thread needed to be created but couldn’t because varnishd had already maxed out its capacity. If you have a value greater than zero and still have resources available, you should increase the thread_pool_max value.

Backend metrics

backend_fail: the number of backend connection failures. This counter should be very close to zero. If that’s not the case, it could mean you have:

network issues
overloaded/laggy backend (time to first byte or between bytes exceeded)
unresponsive backend
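The counters described above that are expected to stay at zero can be checked in one pass. Here is a minimal Python sketch, assuming the counters have already been collected into a dictionary keyed by the names used in this post; the strict "greater than zero" rule is a simplification, and the sample values are made up.

```python
from typing import Dict, List

# Counters from this post that should stay at (or very close to) zero.
ZERO_EXPECTED = ("sess_dropped", "threads_failed", "threads_limited", "backend_fail")


def check_counters(counters: Dict[str, int]) -> List[str]:
    """Return one warning per watched counter that is greater than zero."""
    warnings = []
    for name in ZERO_EXPECTED:
        value = counters.get(name, 0)  # a missing counter is treated as zero
        if value > 0:
            warnings.append(f"{name} is {value}, expected 0")
    return warnings


# Made-up values, for illustration only.
for warning in check_counters({"sess_dropped": 0, "threads_failed": 2, "backend_fail": 1}):
    print(warning)
```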

Further Reading and sources

https://www.datadoghq.com/blog/top-varnish-performance-metrics/

2016-07-20, updated 2017-02-09

Startup blogs

Many startups and technology companies publish a blog in which they talk about their technical problems. These blogs are a great source of first-hand information for developers and admins. You can find a post in French with a good list of blogs to add to your favorite RSS reader here.

2016-06-20, updated 2017-02-20

[Varnish] Purge and proxies

An interesting article about IP filtering and the purge method when using a proxy (such as nginx for TLS offloading): here.

2016-02-18, updated 2017-02-09

General overview of PHP

I originally wrote the content of this post as an entry in my work wiki. It’s intended for new administrators with little knowledge of web environments.

What is PHP?

PHP is a server-side scripting language designed for web development. PHP code may be embedded into HTML code, or it can be used in combination with various template systems, content management systems and frameworks. PHP code is usually processed by an interpreter implemented as a module in the web server (like Apache mod_php) or as a separate executable (PHP-FPM).

PHP versions

The PHP core was rewritten several times, breaking compatibility each time. Since PHP4 the official interpreter is named “Zend Engine”. PHP4/ZendEngine 1.0 was first released in May 2000 and deprecated in August 2008. PHP5/ZendEngine 2.0 was first released in 2004 and is still actively maintained. PHP6 was an experimental branch for implementing Unicode support. It’s no longer in use. The next branch is called PHP7/ZendEngine 3.0 and was first released in December 2015.

PHP and webservers

In order to “connect” PHP with your webserver you can use two different systems:

an “embedded” interpreter: mod_php for Apache HTTPd, isapi for IIS, etc.
an external fast-cgi process: php-fpm

The embedded interpreter was the historical choice, but since PHP5.5 the fast-cgi approach is recommended. In either case you need to load the appropriate module in your webserver.

+ Embedded interpreter: mod_php

mod_php is an Apache module that loads the official interpreter (Zend Engine) inside Apache itself. With mod_php the code is executed by Apache (as user www-data on Debian). It’s not possible to use a different unix user for each application. Settings files are loaded only once, at startup, so any configuration change implies restarting Apache. mod_php can also be memory hungry, because each new Apache child process loads the PHP interpreter even if no PHP code is executed.

+ External fast-cgi process: PHP-FPM

PHP-FastCGI Process Manager is a daemon implementing the fast-cgi protocol for PHP. PHP-FPM is the official implementation since PHP5.3.3, superseding other fast-cgi implementations like FCGI, SpawnFCGI, etc.

PHP-FPM is more efficient than mod_php because process spawning is adaptive. FPM can start workers with different uid/gid and different settings files. This allows much greater security and scalability, because the webserver and the code interpreter can be split into their own server environments if necessary. PHP-FPM can also share the opcode cache across multiple processes.

PHP-FPM runs as a standalone daemon, but you still need a module to connect it to your webserver:

for nginx: ngx_http_fastcgi_module
for apache 2.4 and greater: mod_proxy_fcgi
for older apache versions: mod_fastcgi (this setup is not recommended)

HHVM

Besides the official interpreter (Zend Engine), several other implementations of PHP exist: Pipp, Phalanger, HHVM. These implementations are partial! There is no guarantee that your application will work with them.

The most interesting alternative implementation is HHVM, aka HipHop Virtual Machine. HHVM works on the same principle as the JVM (Java Virtual Machine). HHVM not only translates PHP code into a high-level bytecode (more or less like other opcode solutions) but also executes it with a JIT compiler. PHP code performance can be increased by a factor of 2 to 5 when using HHVM.

PHP accelerator

In order to improve PHP performance, several “accelerator” extensions have been made. These extensions all work on the same principle: they store the already parsed PHP code as a pseudo-bytecode, called opcode, generally kept in memory. This technique drastically reduces web application response times.

+ APC

For PHP versions up to 5.3, APC is the recommended accelerator.
You can adjust the amount of memory used by APC (the apc.shm_size setting) in /etc/php5/apache2/conf.d/apc.ini. A value of 256 MB is recommended.

It is possible but not advised to install APC on PHP 5.4. Such a configuration has been reported to cause execution errors that break whole applications. APC can’t be installed on newer PHP versions.

+ opCache

Since version 5.5 PHP has its own built-in accelerator: opCache. You can also install and configure opCache manually on PHP 5.4. It’s possible to fine-tune opCache. We usually use these values:

opcache.memory_consumption=256
opcache.interned_strings_buffer=8
opcache.max_accelerated_files=4000
opcache.revalidate_freq=60
opcache.fast_shutdown=1
opcache.enable_cli=1

+ APCu

In addition to opcode caching, APC also offers a way to cache serialized objects, pretty much like redis does. This functionality isn’t provided by opCache, so a new extension was created: APCu. APCu only provides object caching.