vendredi 12 novembre 2010

Introduction to Linux Tracing Toolkit

The Linux Tracing Toolkit next generation (LTTng) is absolutely amazing. In this introduction, we will look at how to interpret the kernel trace of a small program by comparing it to strace. If you want to install components and get a trace on your own machine, you can get packages for the patched kernel and other tools for Ubuntu Maverick in LTTng's PPA on Launchpad. You will also need build-essential to compile the small C executable, strace and hexdump.

Here is the small program. It opens a file and write a integer into it. Then, it adds number in a for loop.
#include <stdio.h>
#define INT_MAX 2147483647
int main(volatile int argc, char **argv) {
    int i = 3;
    FILE *fd = fopen("test.out", "w");
    fwrite(&i, sizeof(int), 1, fd);
    fclose(fd);
    volatile long a;
    int x;
    for (x = 0; x<INT_MAX; x++) {
        a++;
    }
    return 0;
}

Compile it, test it and run it under strace:
$ gcc -o usecase usecase.c
$ ./usecase
$ strace -o usecase.strace ./usecase
Ok. Now, let's run the program under tracing, but before, we have to enable all trace points in the kernel.
$ sudo ltt-armall
To get a tight window around the command, I'm using the small script trace-cmd.sh (see at the end of the post). The script simply starts tracing, run the program and stop the trace. Invoke it like this (don't forget to set the script executable):
./trace-cmd.sh ./usecase
Now, we are ready to launch lttv-gui, which is the utility to look at the trace. To load a trace, select File->Add trace, then select the "trace-cmd" directory created in the same directory as the trace-cmd.sh script. Then, scroll down to the "./usecase" process in the control flow viewer, which is in the middle of the window. Each trace events are displayed under the graph. I will present three phase of the process. First, the creation and file access, then computation phase and the process exit. Trace is rotated to make it more readable. Events name and comments are embedded in the graph. strace events are in red, while related code is in blue.



In addition to system calls, we can look at the paging of an application and see process starvation. This information is not shown by strace. Another cool side effect is that, contrary to strace, the running program can't detect it's under tracing, then we can trace viruses that otherwise quit before they can be analyzed. Tracing reports much more information about kernel internals, which helps a lot to understand what's going on. Events timestamps of the kernel trace are in nanoseconds, so the precision is really high, and since trace points are static (simple function calls compiled right into the kernel), performance impact is ridiculously low, below 3% for core trace points! I'm running LTTng in background on my system without noticeable impact, in flight recorder mode.

I will be working on advanced trace analysis for the next three years or so for my PhD. Since LTTng got all the feature I need for tracing, I will use this tools. My first goal is to create a CPU usage view. Say that something was eating your CPU a minute ago, then with kernel tracing you would be able to go in the "flight recorder" history and see what was running at that time, and understand what was going on.

You can help me in two ways. First, by submitting your ideas about cool trace analysis we could perform and/or you need. Second, you can write to Linus Torvalds and ask him to merge LTTng in mainline. LTTng is the right stuff we need for tracing and it's a bit a shame that this not merged yet while other operating systems have this since a decade.

Next post will be about TMF, an eclipse plugin for trace analysis! Stay tuned!

Script trace-cmd.sh:
#!/bin/sh

if [ -z "$@" ]; then
    echo "missing command argument"
    exit 1
fi

cmd="$@"

name="cmd"
dir="$(pwd)/trace-$name"
sudo rm -rf $dir
sudo lttctl -o channel.all.bufnum=8 -C -w $dir $name
echo "executing $cmd..."
$cmd
echo "return code: $?" 
sudo lttctl -D $name