Shell – Just another Sys Admin blog… wait really ?

2014-11-23 (updated 2017-01-13)

runit

Over the last ten years many initd alternatives have appeared. Some, like initng, are very “SysV-like”. Others, such as systemd and upstart, are far more radical in their design, both more modern and more complex.

But what if you need a simple, lightweight alternative with supervision capability?

Then runit is a very good choice, and you can use it to replace or complement initd.

Install runit

aptitude install runit

Add a service to supervise

The core of runit is the /etc/sv directory. This directory contains a subdirectory for each process that runit should manage. Let’s say we want to add a varnish service:

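# mkdir -p /etc/sv/varnish
# vi /etc/sv/varnish/run
#!/bin/sh
# Send stderr to stdout so both streams reach the logger
exec 2>&1
# Load the settings used below (INSTANCE, DAEMON_OPTS)
. /etc/default/varnish
# Clear the files left behind by a previous run
rm -f "/var/lib/varnish/$INSTANCE"/*
# -F keeps varnishd in the foreground so that runit can supervise it
exec varnishd -F $DAEMON_OPTS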

Simple and neat. Note that the process must not be launched in the background or in daemonized mode; if necessary, add the appropriate option to keep it in the foreground. Then:

# cd /etc/sv/varnish
# chmod +x run
# mkdir supervise
# chmod 755 supervise

Now don’t forget to add a diversion for the official init script, and then enable the supervision:

ln -s /etc/sv/varnish /etc/service/varnish

The sv command

sv status varnish
sv check varnish
sv up varnish
sv down varnish
sv restart varnish

As you can see, the commands are very straightforward.

Note that sv can also be used to send most of the usual Unix signals (HUP, USR1, USR2, etc.), for example sv hup varnish.

Logging process output

Under runit, process logging is dead simple: let your process send its data to STDOUT and svlogd will do the rest. This program collects your process’s output and saves it into the log directory you give it. It also takes care of rotating the log files by itself when necessary.

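To add a log to a process, create an executable log/run script inside the service directory (the target log directory must exist):

# mkdir -p /etc/sv/varnish/log /var/log/varnishd
# vi /etc/sv/varnish/log/run
#!/bin/sh
exec svlogd -tt /var/log/varnishd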
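
The service layout described above (a run script plus a log/run script) can be sketched as a small helper. This is only an illustration: make_service and its four arguments are names made up for this example, not part of runit, and the generated scripts are deliberately minimal.

```shell
#!/bin/sh
# make_service SVDIR NAME LOGDIR COMMAND
# Hypothetical helper: writes SVDIR/NAME/run and SVDIR/NAME/log/run.
# COMMAND must keep the process in the foreground.
make_service() {
    svdir=$1 name=$2 logdir=$3 cmd=$4
    mkdir -p "$svdir/$name/log" || return 1

    # Main run script: merge stderr into stdout, then exec the command
    cat > "$svdir/$name/run" <<EOF
#!/bin/sh
exec 2>&1
exec $cmd
EOF

    # Log run script: svlogd reads the service output on its stdin
    cat > "$svdir/$name/log/run" <<EOF
#!/bin/sh
exec svlogd -tt $logdir
EOF

    chmod +x "$svdir/$name/run" "$svdir/$name/log/run"
}
```

Something like make_service /etc/sv myservice /var/log/myservice 'mydaemon --foreground' would then prepare the directory (the service name and daemon are placeholders). The log directory itself still has to be created, and the service still has to be linked into /etc/service.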

Replace initd

To replace initd with runit, just follow the official documentation.

2014-09-08 (updated 2016-09-08)

Override the TERM environment variable

The TERM environment variable is set by your terminal emulator to indicate which control sequences it supports. For the most part you should not override this variable, but there is a narrow set of circumstances where the default value is incorrect, or at least not optimal.

For example, in most distributions the default value is linux for a console terminal and xterm for a graphical terminal. That’s fine, but it doesn’t let you enjoy the 256-color mode of vim or screen in a graphical terminal.

A traditional xterm only supports 16 colors, and this value is specified in the terminfo database.
As you can imagine, changing the xterm entry would have made users of the newer versions happy, but it would also have broken the configuration for everyone else. Instead, a new value, xterm-256color, was declared. That may seem fine, but when you log in remotely to a machine with an older terminal database, the value isn’t recognized.

In this case the only viable solution left is to override the TERM value in order to drop the 256-color mode. You can do it by adding something like this to the .bash_profile on the remote machine:

# Strip the -256color suffix so that older terminfo databases recognize TERM
if echo "$TERM" | grep -q -- '-256color'; then
    export TERM=$(echo "$TERM" | sed 's/-256color//')
fi
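
A slightly more careful variant, given here as a sketch only, downgrades TERM just when the machine really doesn’t know the entry. It assumes that infocmp (from ncurses) is installed and exits with a non-zero status for an unknown terminal type. strip_256color is a helper name invented for this example.

```shell
# strip_256color NAME: print NAME without a trailing "-256color"
strip_256color() {
    case "$1" in
        *-256color) printf '%s\n' "${1%-256color}" ;;
        *)          printf '%s\n' "$1" ;;
    esac
}

# Only touch TERM when the local terminfo database has no entry for it
# (assumption: infocmp fails for unknown terminal types)
if [ -n "${TERM:-}" ] && ! infocmp "$TERM" >/dev/null 2>&1; then
    TERM=$(strip_256color "$TERM")
    export TERM
fi
```

This way a remote machine with an up-to-date terminfo database keeps the 256-color mode.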

2014-09-04 (updated 2016-09-13)

GoAccess

GoAccess is a real-time, ncurses-based web log analyzer and interactive viewer. Unlike AWStats and similar products, GoAccess doesn’t keep any “history”, but in exchange it is much faster. Having on-the-fly HTTP statistics is extremely useful when the load suddenly increases on a front-end web server.

To analyze a given log file, use the -f option:

goaccess -f /var/log/apache/access.log

If you don’t use the standard NCSA or common log format, you can specify the log format with --log-format=. For a more permanent solution, you can redefine the log-format value in the /etc/goaccess.conf configuration file.

You can also generate an HTML report like this:

goaccess -f /var/log/apache/access.log -a -o report.html

2014-02-21 (updated 2017-01-31)

Top, ps and cpu usage

One thing that mystifies Unix newbies is the difference between the CPU usage reported by top and by ps:

# ps -o %cpu 21162
%CPU
5.5

but top -p 21162 gives us 18%. There is something wrong, right?
Nope. The confusion arises from the fact that ps and top don’t share the same definition of what constitutes CPU usage.

top gives you the average CPU consumption, expressed per core, over a short period of time (3 seconds by default). A value of 200% means that the process “hogged” two cores on average during the last 3 seconds.

On the other hand, ps calculates its value over the whole process lifetime, and doesn’t take into account how many cores have been hogged. So, for example, a value of 15.3% means that the process has spent 15.3% of its lifetime running on the CPU. The other 84.7% of the time the process was doing nothing, probably waiting for some input to arrive.

So as you can see, these two commands have very different definitions of what CPU usage is, and both values are relevant in their own way.
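
Both figures come down to the same ratio, CPU time divided by elapsed time; only the window changes. The sketch below is an illustration, not the actual code of ps or top. lifetime_cpu_pct and proc_cpu_pct are names made up for this example, and proc_cpu_pct assumes a Linux /proc filesystem where fields 14, 15 and 22 of /proc/PID/stat are utime, stime and starttime in clock ticks.

```shell
# lifetime_cpu_pct CPU_TICKS HZ ELAPSED_SECONDS
# CPU time consumed, as a percentage of the elapsed time
lifetime_cpu_pct() {
    awk -v t="$1" -v hz="$2" -v el="$3" 'BEGIN {
        if (el <= 0) el = 1 / hz          # avoid dividing by zero
        printf "%.1f\n", 100 * (t / hz) / el
    }'
}

# proc_cpu_pct PID: ps-style figure computed from /proc (Linux only)
proc_cpu_pct() {
    pid=$1
    stat=$(cat "/proc/$pid/stat") || return 1
    rest=${stat##*) }                     # drop "pid (comm) "
    set -- $rest                          # $1 is now field 3 (state)
    hz=$(getconf CLK_TCK)
    up=$(cut -d' ' -f1 /proc/uptime)
    # ${20} is starttime: ticks between boot and process start
    elapsed=$(awk -v up="$up" -v st="${20}" -v hz="$hz" \
        'BEGIN { printf "%.2f\n", up - st / hz }')
    # ${12} and ${13} are utime and stime, in ticks
    lifetime_cpu_pct "$(( ${12} + ${13} ))" "$hz" "$elapsed"
}
```

With 100 ticks per second, lifetime_cpu_pct 550 100 100 gives 5.5 (5.5 seconds of CPU over a 100 second lifetime, the ps view), while lifetime_cpu_pct 600 100 3 gives 200.0 (6 seconds of CPU in a 3 second window, the top view of a process keeping two cores busy).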

2014-01-04 (updated 2016-09-08)

Fastest method to delete a huge number of files

Let’s say you have a filesystem with a directory containing a huge number of files, something like half a million. What is the fastest method to delete so many files?

Never use shell expansion

Before answering the question let’s state the obvious: you should never use shell expansion when dealing with a huge number of files.

cd huge_directory
rm -f *
-bash: /bin/rm: Argument list too long

As you can see, rm didn’t even start, because shell expansion produced a command line that exceeds the ARG_MAX limit (a fixed 128 KB before kernel 2.6.23; since then it is a quarter of the stack size limit, typically 2 MB). So if you insist on using rm for the job (and you shouldn’t), at least do it the right way:

rm -rf huge_directory

rm will then list the files and directories it’s going to delete by itself, so the shell never has to build an oversized argument list.
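
To get a feeling for how close a directory is to the limit, you can compare a rough estimate of what * would expand to with the value reported by getconf. This is only a sketch: arg_max_bytes and glob_bytes are names invented here, and the estimate ignores the environment and the pointer array, which also count against the limit, and miscounts names containing newlines.

```shell
# arg_max_bytes: limit for arguments + environment of a new process
arg_max_bytes() {
    getconf ARG_MAX
}

# glob_bytes DIR: rough size of the expansion of "*" inside DIR
# (each name plus its terminating NUL, dotfiles skipped like the glob)
glob_bytes() {
    ( cd "$1" && find . -mindepth 1 -maxdepth 1 ! -name '.*' ) |
        awk '{ n += length($0) - 2 + 1 } END { print n + 0 }'
}
```

If glob_bytes huge_directory comes close to arg_max_bytes, rm -f * has little chance of working in that directory.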

Using find

“If you can’t do it in one sitting, divide and make a loop”. That’s the strategy of the find command. First let’s use it with the -exec parameter:

time find ./ -type f -exec rm {} \;
real    14m51.735s
user    2m24.330s
sys     9m48.743s

Not so great. In fact, using find this way is very inefficient because it spawns an external rm process for each file! Luckily, find has its own built-in -delete action:

time find ./ -type f -delete
real    5m11.937s
user    0m1.259s
sys     0m28.441s

That’s much better, but we can do better.
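
If you want to reproduce this kind of measurement on your own filesystem, a small test bed along these lines can help. It is a sketch under assumptions: make_files, purge_files and count_files are helper names made up for this example, the files are empty, and the timings you get will depend heavily on the filesystem and hardware.

```shell
#!/bin/sh
# make_files DIR COUNT: create COUNT empty files in DIR
make_files() {
    mkdir -p "$1" || return 1
    i=0
    while [ "$i" -lt "$2" ]; do
        : > "$1/file_$i"
        i=$((i + 1))
    done
}

# purge_files DIR: delete every regular file, without shell expansion
purge_files() {
    find "$1" -type f -delete
}

# count_files DIR: number of regular files left under DIR
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}
```

In bash, make_files /tmp/huge_directory 500000 followed by time purge_files /tmp/huge_directory gives a figure comparable to the ones above (the path is just an example).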

Using rsync

Using rsync for this task may seem a little strange at first, but it works really, really well:

time rsync -a --delete emptydir/ huge_directory/
real    2m52.502s
user    0m2.772s
sys     0m32.649s

Clearly rsync is the winner. But why? It’s a little tricky.

When deleting a file you invoke the unlink system call. This call removes a ‘link’ to the file’s inode. Once there are no more links, the system frees the associated space by zeroing the inode. Pretty simple stuff. But how does it know which ‘link’ to unlink? By using another system call that lists the content of a given directory: readdir.

Now here’s the thing. readdir doesn’t list files in order but seemingly at random (not truly random, since the order depends on the inode numbers), and in chunks of 32 KB. With a reasonable number of files that’s perfectly fine and quite efficient, but not in an overcrowded directory.

This behavior is the main reason why using ls or rm in a directory containing millions of files is such a pain in the ass. Each operation makes hundreds of readdir calls.

On the other hand, rsync doesn’t rely on readdir but uses its own implementation. It uses a single huge buffer and lists all the files in reverse order. This way rsync can unlink file after file without glancing a second time at the directory structure. That’s a huge time saving when dealing with millions of files.

2013-11-16 (updated 2017-02-09)

Mosh

I think nobody reading this blog needs an introduction to SSH, the standard for remote terminal access since the end of the 90’s. We all know and love this protocol, and its main implementation, openssh. But sometimes SSH’s strict and clean design can be a pain in the ass. During my on-call duty I sometimes have no other choice than to work over a poor 3G/EDGE mobile connection. High latency and intermittent connectivity don’t play well with SSH. Even with a GNU screen session on the remote server, that’s never an enjoyable moment.

It’s in such situations that a tool like mosh becomes interesting.

What’s Mosh?

Mosh stands for Mobile Shell. Like SSH, it’s a remote-terminal protocol, but designed with mobile access in mind. It allows roaming, supports intermittent connectivity, and provides predictive echo and local buffering for line typing/editing/deleting (yep, openssh waits for the server’s reply before showing you your own typing; now you understand the typing latency). All of these features make it way more convenient to use on high-latency and/or unreliable links than a standard SSH session.

Installing Mosh

Mosh needs to be installed on both the client and the server. For Debian, there is only one package, simply called mosh. It has been available in the official repository since Debian Wheezy.

Using Mosh

It’s much simpler than you think. Just type:

mosh username@server

and the mosh command will take care of everything. First it logs you in using the ssh command, then it starts mosh-server on the remote machine. After that it closes the ssh session and reconnects you to the mosh one. Note that by default mosh-server picks a UDP port between 60000 and 61000. If, like me, you’re not a fan of opening a whole port range, you can use the -p parameter to force a specific port of your choice, for example mosh -p 60001 username@server.

2013-11-01 (updated 2016-09-08)

lsof – a more powerful command than you think

lsof is a command used to find out which files are opened by which process. But since the Unix way of life is that everything is a file, lsof can be used for a lot more than that:

List processes which opened a specific file

# lsof /var/log/syslog
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
rsyslogd 13315 root    7w   REG   0,36   246213 15074839 /var/log/syslog

List opened files under a directory

# lsof +D /var/log/
COMMAND    PID          USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
php5-fpm  1968          root    2w   REG   0,36      434 15074914 /var/log/php5-fpm.log
php5-fpm  1968          root    5w   REG   0,36      434 15074914 /var/log/php5-fpm.log
apache2   7466          root    2w   REG   0,36      279 15076913 /var/log/apache2/error.log
...

List all open files by a specific process

# lsof -p 1968
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
php5-fpm 1968 root  cwd    DIR               0,36     4096  15073281 /
php5-fpm 1968 root  rtd    DIR               0,36     4096  15073281 /
php5-fpm 1968 root  txt    REG               0,36  9110296  15081382 /usr/sbin/php5-fpm
php5-fpm 1968 root  mem    REG              253,0           15081382 /usr/sbin/php5-fpm (path dev=0,36)
...

List opened files based on process names

# lsof -c ssh
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF       NODE NAME
sshd     8463 root  cwd    DIR               0,36     4096   15073281 /
sshd     8463 root  rtd    DIR               0,36     4096   15073281 /
sshd     8463 root  txt    REG               0,36   787080   15076801 /usr/sbin/sshd
sshd     8463 root  mem    REG              253,0            15077206 /lib/x86_64-linux-gnu/libnss_files-2.19.so (path dev=0,36)
...

You can use the -c parameter multiple times.

Show network connections

# lsof -i
COMMAND     PID  USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
ssh         636 daber    3u  IPv4 1381573      0t0  TCP 10.10.32.54:54188->b1.vpn.ti.smile.fr:ssh (ESTABLISHED)
ssh         834 daber    3u  IPv4 1385285      0t0  TCP 10.10.32.54:60902->b3.vpn.ti.smile.fr:ssh (ESTABLISHED)
ssh         892 daber    3u  IPv4 1386338      0t0  TCP 10.10.32.54:39496->b2.vpn.ti.smile.fr:ssh (ESTABLISHED)
chromium   1476 daber   87u  IPv4 1429223      0t0  TCP zapan.dhcp.mpl.intranet:49404->par10s09-in-f35.1e100.net:https (ESTABLISHED)
...

You can also add additional parameters to filter on the port number. For example to only show SSH connections: lsof -i TCP:22
You can also specify a range: lsof -i TCP:1-1024

Show all files opened by a specific user

# lsof -u daber
COMMAND     PID  USER   FD      TYPE             DEVICE  SIZE/OFF       NODE NAME
ssh         636 daber  cwd       DIR               0,35     28672   13238276 /home/daber
ssh         636 daber  rtd       DIR                8,1      4096          2 /
ssh         636 daber  txt       REG                8,1    666088     659561 /usr/bin/ssh
ssh         636 daber  mem       REG                8,1     22952     524315 /lib/x86_64-linux-gnu/libnss_dns-2.19.so
...

Note that you can use ^ to negate the selection and exclude a particular user, for example lsof -u ^daber.

Kill all processes of a particular user

# kill $(lsof -t -u daber)

2013-06-01 (updated 2017-02-09)

Zsync

zsync is a file transfer program that allows you to synchronize a local file with a remote server version. zsync is very efficient as it only downloads new parts of the file, using the same algorithm as rsync.

Server usage

Use zsyncmake to build a control file for the zsync client, like this:

zsyncmake -z myiso.iso

Client usage

zsync http://foobar.com/daily-live/current/myiso.iso.zsync

2012-10-26 (updated 2017-02-09)

nslookup

nslookup is a network administration tool for querying DNS servers. It is a very useful tool for debugging DNS records.

Query a domain name

Using the current ‘default’ DNS server:

# nslookup debian.org 
Server:         62.210.16.6
Address:        62.210.16.6#53

Non-authoritative answer:
Name:   debian.org
Address: 5.153.231.4

Using a specific DNS server:

# nslookup debian.org 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   debian.org
Address: 5.153.231.4

Query the MX Record

# nslookup -query=mx debian.org 
Server:         62.210.16.6
Address:        62.210.16.6#53

Non-authoritative answer:
debian.org      mail exchanger = 0 muffat.debian.org.
debian.org      mail exchanger = 0 mailly.debian.org.

Here we have two MX (mail exchange) servers for the zone debian.org.

Query the NS Record

# nslookup -query=ns debian.org
Server:         62.210.16.6
Address:        62.210.16.6#53

Non-authoritative answer:
debian.org      nameserver = dns1.easydns.com.
debian.org      nameserver = debian1.dnsnode.net.
debian.org      nameserver = dns4.easydns.info.
debian.org      nameserver = sec1.rcode0.net.
debian.org      nameserver = sec2.rcode0.net.

The NS records give the list of the domain’s authoritative DNS servers.

Query the SOA Record

# nslookup -query=soa debian.org
Server:         62.210.16.6
Address:        62.210.16.6#53

Non-authoritative answer:
debian.org
        origin = denis.debian.org
        mail addr = hostmaster.debian.org
        serial = 2016092612
        refresh = 1800
        retry = 600
        expire = 1814400
        minimum = 600

The SOA record (start of authority) gives information about the domain: its TTL, the e-mail address of the domain administrator, the domain serial number, etc.

Further Reading and sources

http://www.thegeekstuff.com/2012/07/nslookup-examples

2012-05-02 (updated 2016-09-08)

Split a file into multiple files

Let’s say you have a file with multiple sections, delimited by the character sequence -|, and you need to create multiple files, one for each section. How do you do that?

Basically you can use three tools for the job.

Use csplit

csplit is a very useful and not well known utility, shipped in the coreutils package.

csplit --quiet --prefix=outfile infile.txt "/-|/+1" "{*}"

Use awk

awk '{print $0 " -|" > ("output" NR)}' RS='-\\|' infile.txt

Use bash/perl/python

Here I give you the bash one-liner version, but you can also use perl, python, ruby or any other scripting language:

I=0; : > output0; while IFS= read -r line; do printf '%s\n' "$line" >> "output$I"; if [ "$line" = '-|' ]; then I=$((I+1)); : > "output$I"; fi; done < infile.txt
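
The same line-by-line logic can also be wrapped into a small reusable function. This is a sketch that assumes, like the loop above, that each delimiter -| sits alone on its own line. split_sections is a name made up for this example, and unlike the loop it doesn’t leave an empty trailing file behind.

```shell
# split_sections INFILE PREFIX
# Write each section of INFILE (delimiter line included) to PREFIX0,
# PREFIX1, ... A section ends with a line containing only "-|".
split_sections() {
    awk -v prefix="$2" -v n=0 '
        { out = prefix n; print > out }     # append line to current section
        $0 == "-|" { close(out); n++ }      # delimiter: move to next file
    ' "$1"
}
```

For example, split_sections infile.txt output produces output0, output1 and so on in the current directory.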