[Varnish] Basic performance tuning

Storage engine

Varnish has two storage engines: file and malloc. Both require you to specify the amount of space used for caching objects (on Debian, inside the /etc/default/varnish file). The difference between the two is how Varnish accesses the cached content.

With file storage, Varnish creates a backing file on disk and maps it into virtual memory using the mmap system call. All operations go through the mapped file. This mode is pretty simple, very ‘posix-compliant’ and therefore portable but… not so great for high performance, even if you use very fast drives.

Now that doesn’t mean file storage is completely useless. It’s a good choice for the particular case where you can’t have enough RAM to cache all the needed content (and adding an SSD-like card such as a Fusion-io isn’t an option), because reading from your IO subsystem will still be faster than generating the content again. But it shouldn’t be your first pick.

Next we have malloc storage, where Varnish reserves a chunk of memory for each cached object using the C library function of the same name. This is the fastest type of cache, as memory latency is orders of magnitude lower, and throughput far higher, than even the fastest SSDs. My recommendation is to always choose this storage first and only switch as a last resort.
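
As a rough illustration of the choice above, here is how the storage engine might be selected in /etc/default/varnish. This is only a sketch: it assumes an init script that reads DAEMON_OPTS from that file (as older Debian packages do), and the 1G size, the listen port and the storage file path are placeholder values, not recommendations.

```shell
# Sketch of a storage selection for /etc/default/varnish.
# CACHE_SIZE and the storage file path below are assumptions; adapt them.
CACHE_SIZE="1G"

# malloc storage: cached objects live in memory obtained with malloc(3).
STORAGE="malloc,${CACHE_SIZE}"

# file storage alternative: a backing file mapped with mmap(2).
# Uncomment only if the working set cannot fit in RAM.
# STORAGE="file,/var/lib/varnish/varnish_storage.bin,${CACHE_SIZE}"

# -s selects the storage backend passed to varnishd.
DAEMON_OPTS="-a :6081 -f /etc/varnish/default.vcl -s ${STORAGE}"
```

Keeping the size in a single variable makes it easy to switch between the two engines without retyping it.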

Worker threads

Worker threads are the ones that deliver cached objects to clients. You can tune them with three parameters: thread_pools, thread_pool_min and thread_pool_max.

thread_pools is the number of pools that threads are grouped into. The default value is 2. You can increase the number of pools, but you shouldn’t exceed the number of CPU cores.

thread_pool_min defines the minimum number of threads that must always be alive in each pool. As creating threads is a time-consuming operation, it is better to always keep at least 50 threads alive. In case of a sudden spike in traffic, you will have enough threads to handle the first wave of requests while Varnish spawns new ones.

thread_pool_max defines the maximum number of threads per pool. This parameter is trickier, as the ideal value depends on available resources and the traffic you want to handle. Usually you don’t want to go over 5000 threads, as specified in the documentation.
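
To make the three settings concrete, here is a hedged sketch of how they could be passed to varnishd with -p (where the pool-count parameter is spelled thread_pools). The numbers assume a hypothetical 4-core machine and are only there to show the arithmetic: both limits apply per pool, so the totals are multiplied by the number of pools.

```shell
# Hypothetical sizing for a 4-core machine; tune to your hardware and traffic.
THREAD_POOLS=4          # no more than the number of CPU cores
THREAD_POOL_MIN=50      # threads kept alive in each pool
THREAD_POOL_MAX=1000    # upper bound for each pool

# Both limits are per pool, so totals scale with the number of pools.
MIN_THREADS=$((THREAD_POOLS * THREAD_POOL_MIN))
MAX_THREADS=$((THREAD_POOLS * THREAD_POOL_MAX))

# Each -p sets one run-time parameter at startup.
THREAD_OPTS="-p thread_pools=${THREAD_POOLS}"
THREAD_OPTS="${THREAD_OPTS} -p thread_pool_min=${THREAD_POOL_MIN}"
THREAD_OPTS="${THREAD_OPTS} -p thread_pool_max=${THREAD_POOL_MAX}"

echo "threads kept alive: ${MIN_THREADS}, worst case: ${MAX_THREADS}"
echo "append to DAEMON_OPTS: ${THREAD_OPTS}"
```

Once Varnish is running, a command such as `varnishadm param.show thread_pools` should show the value actually in effect.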

Varnish shared memory log

The VSL file is used to log most traffic data. It is a circular, non-persistent buffer of 80MB by default. Tools like varnishlog and varnishtop read from this file. On a Debian system, you can find it in the /var/lib/varnish directory. To improve performance, you can move the whole directory into a tmpfs.
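
One possible way to do that is an /etc/fstab entry for the directory. The sketch below only builds and prints the entry so it can be reviewed first; the 128M size and the mount options are assumptions (the size just needs to leave some headroom over the log buffer).

```shell
# Build a tmpfs fstab entry for the Varnish working directory.
# TMPFS_SIZE and the mount options are assumptions; adjust before use.
VSL_DIR="/var/lib/varnish"
TMPFS_SIZE="128M"

FSTAB_LINE="tmpfs ${VSL_DIR} tmpfs rw,nodev,nosuid,size=${TMPFS_SIZE} 0 0"

# Print the entry for review; nothing is modified on the system.
echo "${FSTAB_LINE}"

# To apply it (as root): stop Varnish, append the line to /etc/fstab,
# mount the directory, then start Varnish again:
#   service varnish stop
#   mount /var/lib/varnish
#   service varnish start
```

Since the buffer is non-persistent anyway, losing the content of the tmpfs on reboot costs nothing.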