Friday, April 25, 2008

Determining Signal Definitions In Linux and Unix

Greetings,

Today, we're going to take a quick look at the signals behind the simple return codes (or exit statuses) that get reported by the bash and ksh shells in Linux (and Solaris Unix). This complements a post we did a while back on trapping signals in Perl and the shell, and adds a bit more to the understanding of the fork-and-exec process introduced in our post on running a login shell on a network port.

While signal numbers can vary from one platform to the next, the signal names defined by the POSIX standard are fairly consistent between most Unix and Linux systems. If you ever need to find out what a signal means, you have a number of options at your disposal.

1. At the process level on Solaris, you can figure out how a given process handles each signal by using the "psig" command. This command will list out every signal and the action associated with it (caught, ignored or default) for a given process (assuming you know its process ID). A sample of this process-finding process would be something like:

host # ps -ef|grep "[l]pd"
daemon 2073 1 0 Mar 21 ? 0:00 /usr/local/sbin/lpd
host # psig 2073
2073: /usr/local/sbin/lpd
HUP caught sigacthandler
...
ILL default
...
PIPE ignored
...


Basically, you'll get a ton of output, and none of it may make any sense to you. That's okay for now :)

2. At the process level on Linux, you can get approximately the same functionality from the "crash" command (used interactively, if available) or, even better, from a publicly available script written specifically to emulate Solaris' psig functionality on Linux. It's called psig.sh for Linux and is an excellent tool for replicating the psig experience for Linux users who are used to Solaris' proc commands.
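If neither of those is handy, the kernel exposes the same basic information under /proc. What follows is only a rough sketch, assuming a Linux /proc filesystem whose status file carries the SigBlk, SigIgn and SigCgt hex masks, and assuming bash. The helper names decode_sigmask and show_signal_masks are made up for this example and are not part of psig.sh:

```shell
#!/bin/bash
# Sketch only: decode_sigmask and show_signal_masks are hypothetical helpers.

# Print the signal numbers whose bits are set in a hex mask.
# Bit 0 of the mask corresponds to signal 1, bit 1 to signal 2, and so on.
decode_sigmask() {
    mask=$((0x$1))
    n=1
    out=""
    while [ "$n" -le 64 ]; do
        if [ $(( (mask >> (n - 1)) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$n"
        fi
        n=$((n + 1))
    done
    echo "$out"
}

# Show the blocked, ignored and caught signal numbers for one process ID.
show_signal_masks() {
    status_file="/proc/$1/status"
    [ -r "$status_file" ] || { echo "cannot read $status_file" >&2; return 1; }
    while read -r field value; do
        case $field in
            SigBlk:|SigIgn:|SigCgt:)
                echo "$field $(decode_sigmask "$value")" ;;
        esac
    done < "$status_file"
}

# Try it on the current shell; harmless if /proc is not available.
show_signal_masks "$$" || true
```

The numbers it prints can then be matched against the signal definitions we track down below.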

3. On either Linux or Unix (any flavor), you can determine what signals are defined on your machine, what their names are and what their signal numbers are, by checking your system include files. On Solaris, the header file that contains this information is almost guaranteed to be named /usr/include/sys/iso/signal_iso.h (at least for Solaris 9 and 10). In Linux, it will generally be very specific to the architecture of your build system (something like: /usr/include/asm-x86_64/signal.h).

In any event, you can follow a simple process to figure out where your system's signal definitions are loaded, no matter what flavor of Unix or Linux you're running. It all comes down to derivation and, luckily, you'll always have the same starting point (I may be wrong on this, but I've never experienced it ;).

Here's a quick command line walkthrough of two different ways to figure out where your signal definitions are listed (that is, the include/header file that defines the signal numbers, their abbreviated signal names and definitions) that can be used on almost any *nix system. As a starting point, /usr/include/signal.h is probably the best, since it exists on most systems. We'll also use SIGHUP (the standard HangUp signal and signal number 1) to guide our search:

The painful way:

host # grep SIGHUP /usr/include/signal.h|grep -w 1
host # grep include /usr/include/signal.h
#include <sys/feature_tests.h>
#include <sys/types.h>
...
host # grep include /usr/include/signal.h|sed 's/^.*<\([^>]*\)>/\1/'|xargs -ivar grep SIGHUP /usr/include/var
host # grep include /usr/include/signal.h|sed 's/^.*<\([^>]*\)>/\1/'|xargs -ivar grep "^#include" /usr/include/var|grep sig
#include <sys/iso/signal_iso.h>
#include <sys/siginfo.h>
...


... and so on, and so on, until you find the right file. Basically, just follow the includes until you reach the end. This can be tedious and time-consuming.

The incredibly easy way:

Solaris_9_or_10_host # find /usr/include|xargs grep SIGHUP /dev/null|grep -w 1
/usr/include/sys/iso/signal_iso.h:#define SIGHUP 1 /* hangup */
<--- This is our file :)

or

Linux_host # find /usr/include|xargs grep SIGHUP /dev/null|grep -w 1
/usr/include/bits/signum.h:#define SIGHUP 1 /* Hangup (POSIX). */
/usr/include/asm-x86_64/signal.h:#define SIGHUP 1
/usr/include/asm-i386/signal.h:#define SIGHUP 1
<--- Note that all 3 of these entries are valid, with bits/signum.h being the most generic. Our next command can narrow down the most "specific" correct result.
Linux_host # uname -pr
2.6.5-7.286-smp x86_64
<--- Now we know that, if we want to be extra sure, the "/usr/include/asm-x86_64/signal.h" is the signal definition file we should be looking at to determine our signal information.
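If you find yourself doing this search often, it can be wrapped up in a small function. This is just a sketch under a few assumptions: the name find_signal_header is made up, it uses find with -exec ... {} + (so only regular files get handed to grep), and the pattern expects the usual "#define NAME NUMBER" layout, which your headers may or may not follow:

```shell
#!/bin/bash
# Sketch only: find_signal_header is a hypothetical helper name.

# List header files under a directory (default /usr/include) that
# define the given signal name with the given signal number.
find_signal_header() {
    name=$1
    number=$2
    dir=${3:-/usr/include}
    # Match "#define NAME NUMBER", allowing flexible whitespace and making
    # sure NUMBER is not just the start of a longer number (1 vs. 10).
    pattern="^[[:space:]]*#[[:space:]]*define[[:space:]]+${name}[[:space:]]+${number}([^0-9]|\$)"
    find "$dir" -type f -exec grep -lE "$pattern" {} + 2>/dev/null
}

# Same search as above: which headers define SIGHUP as 1?
find_signal_header SIGHUP 1 || true
```

Anchoring the pattern on the #define saves the second grep for the number, and it won't be fooled by a comment that happens to contain a 1.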

And that's all there is to it :) As an interesting bit of trivia, if the shell reports a return code (exit status) higher than 128 for a process, you can easily deduce what signal terminated that process. All you need to do is deduct 128 from the return code, like this:

host # sleep 2
host # echo $?
0
host # sleep 200
^C
host # echo $?
130


In this case, when run normally, our sleep command exited with a return code of 0 (success). When we did a control-C during the sleep command, the shell reported a return code of 130. Of course, you won't find 130 defined as a signal number in any of your include files, but, utilizing the shell's native reporting of signal interrupts, all you need to do is subtract 128 from 130 and you now know that the sleep command was killed by signal 2, which is defined as:

SIGINT 2 /* interrupt (rubout) */ <--- on Solaris

and

SIGINT 2 /* Interrupt (ANSI). */ <--- on the Linux box I'm using.

Both answers, although formatted differently, indicate the same signal. Nice :)
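If you'd rather let the shell do the subtraction and the lookup, something along these lines may help. It's a minimal sketch, assuming a shell whose kill built-in accepts "kill -l" followed by a signal number (bash and ksh both do). The name status_to_signal is made up for this example:

```shell
#!/bin/bash
# Sketch only: status_to_signal is a hypothetical helper name.

# Translate a shell return code into the name of the signal that
# terminated the process, or "none" if the code is 128 or lower.
status_to_signal() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l with a number prints the matching signal name (no SIG prefix)
        kill -l "$((status - 128))"
    else
        echo "none"
    fi
}

# Demo: a child shell sends itself SIGTERM, then we decode the return code.
status=0
sh -c 'kill -TERM $$' || status=$?
echo "return code $status, signal: $(status_to_signal "$status")"
```

Feeding it the 130 from the sleep example above gives you INT, which matches what we found in the header files.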

Below, I've listed the standard signals for your convenience (from a fairly generic file - signum.h) and take no credit for writing them myself ;)

Enjoy,

Some Basic signals from /usr/include/bits/signum.h on Linux (Note that I'm only listing the basic signals and not I/O polling signals, etc)

#define SIGHUP 1 /* Hangup (POSIX). */
#define SIGINT 2 /* Interrupt (ANSI). */
#define SIGQUIT 3 /* Quit (POSIX). */
#define SIGILL 4 /* Illegal instruction (ANSI). */
#define SIGTRAP 5 /* Trace trap (POSIX). */
#define SIGABRT 6 /* Abort (ANSI). */
#define SIGIOT 6 /* IOT trap (4.2 BSD). */
#define SIGBUS 7 /* BUS error (4.2 BSD). */
#define SIGFPE 8 /* Floating-point exception (ANSI). */
#define SIGKILL 9 /* Kill, unblockable (POSIX). */
#define SIGUSR1 10 /* User-defined signal 1 (POSIX). */
#define SIGSEGV 11 /* Segmentation violation (ANSI). */
#define SIGUSR2 12 /* User-defined signal 2 (POSIX). */
#define SIGPIPE 13 /* Broken pipe (POSIX). */
#define SIGALRM 14 /* Alarm clock (POSIX). */
#define SIGTERM 15 /* Termination (ANSI). */
#define SIGSTKFLT 16 /* Stack fault. */
#define SIGCLD SIGCHLD /* Same as SIGCHLD (System V). */
#define SIGCHLD 17 /* Child status has changed (POSIX). */
#define SIGCONT 18 /* Continue (POSIX). */
#define SIGSTOP 19 /* Stop, unblockable (POSIX). */
#define SIGTSTP 20 /* Keyboard stop (POSIX). */
#define SIGTTIN 21 /* Background read from tty (POSIX). */
#define SIGTTOU 22 /* Background write to tty (POSIX). */
#define SIGURG 23 /* Urgent condition on socket (4.2 BSD). */
#define SIGXCPU 24 /* CPU limit exceeded (4.2 BSD). */
#define SIGXFSZ 25 /* File size limit exceeded (4.2 BSD). */
#define SIGVTALRM 26 /* Virtual alarm clock (4.2 BSD). */
#define SIGPROF 27 /* Profiling alarm clock (4.2 BSD). */
#define SIGWINCH 28 /* Window size change (4.3 BSD, Sun). */
#define SIGPOLL SIGIO /* Pollable event occurred (System V). */
#define SIGIO 29 /* I/O now possible (4.2 BSD). */
#define SIGPWR 30 /* Power failure restart (System V). */
#define SIGSYS 31 /* Bad system call. */
#define SIGUNUSED 31
#define _NSIG 65 /* Biggest signal number + 1 */
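Since the numbers above come from one Linux box and some of them differ on other systems (the SIGUSR1 and SIGUSR2 numbers, for instance, are not the same on Solaris), it's worth comparing them against whatever your own shell reports. Here's a quick sketch, again assuming a kill built-in that accepts "kill -l" with a number. The name list_signals is made up:

```shell
#!/bin/bash
# Sketch only: list_signals is a hypothetical helper name.

# Print "number name" for the basic signals 1 through 31,
# as reported by the shell's own kill built-in.
list_signals() {
    n=1
    while [ "$n" -le 31 ]; do
        name=$(kill -l "$n" 2>/dev/null) || name="(unnamed)"
        echo "$n $name"
        n=$((n + 1))
    done
}

list_signals
```

The names come back without the SIG prefix, so HUP here is the same thing as SIGHUP in the header file.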


Mike