Paradyn Week 2012 was held at the University of Maryland. I went there to see what other folks are doing in debugging and runtime software monitoring based on modifying executable binaries, and to compare that approach with our current tracing methods.
Dyninst is a set of tools to modify assembly code. It's like a Swiss Army knife for binaries: it can disassemble them, analyse instructions, recover control flow, and then insert or delete code. It generates code for x86, amd64, ppc32 and ppc64, and a few other more exotic architectures.
The first presentation was about data flow analysis. The most interesting part, from my point of view, was the liveness analysis. It computes the set of registers read or written in a function. While tracing, only the subset of registers that actually contain values is saved. The performance benefit has not been evaluated yet, but the hypothesis is that saving fewer registers at each tracepoint lowers the overhead. However, since compilers try to use as many registers as possible, and x86 has very few general-purpose registers, the number of live registers is probably, on average, close to the entire register set.
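To make the idea concrete, here is a minimal sketch of a backward liveness pass over a single straight-line block. It is not Dyninst's API or its algorithm, which works on the whole control flow graph of a real binary. The instruction model (a pair of read and written register sets), the `live_before` helper, and the sample block are all assumptions made up for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical instruction model: (registers read, registers written).
Instr = Tuple[Set[str], Set[str]]


def live_before(instrs: List[Instr], live_out: Set[str]) -> List[Set[str]]:
    """Return, for each instruction, the set of registers live just before it."""
    live = set(live_out)
    result: List[Set[str]] = [set() for _ in instrs]
    # Walk the block backwards: liveness flows from uses up to definitions.
    for i in range(len(instrs) - 1, -1, -1):
        reads, writes = instrs[i]
        # A register written here is dead above this point,
        # unless the same instruction also reads it.
        live = (live - writes) | reads
        result[i] = set(live)
    return result


# Made-up block, loosely modelled on x86-64 code.
block: List[Instr] = [
    ({"rdi"}, {"rax"}),          # mov rax, rdi
    ({"rax", "rsi"}, {"rax"}),   # add rax, rsi
    ({"rax"}, {"rcx"}),          # mov rcx, rax
    (set(), {"rdx"}),            # xor edx, edx (modelled as a pure write)
    ({"rcx", "rdx"}, {"rax"}),   # lea rax, [rcx+rdx]
]

# Assume only rax (the return value) is needed after the block.
live = live_before(block, live_out={"rax"})
print(sorted(live[3]))  # ['rcx']
```

In this toy block, a tracepoint inserted just before the fourth instruction would only need to save rcx instead of every register the block touches. Whether real compiled code leaves that much slack is exactly the open question mentioned above.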
Then there was a talk by Josh Stone from Red Hat about integrating Dyninst with SystemTap. The core idea is to compile STP scripts into a shared library and use it to instrument userspace applications. The current implementation is able to connect static tracepoint probes to handler functions. For this proof of concept, the handler prints the function name to the console.
There was also a talk about debugging at extreme scale. The scale considered is 100k nodes and 1M cores. The idea is to script the debugging phase instead of trying to use an interactive debugger. The debugging script is added to the cluster scheduler queue to run later. MRNet, a tree-based overlay network library, is used to control the debugger. A benchmark shows that setting a breakpoint on 1M cores takes about 200ms! MRNet is also used to gather the results with a reduction algorithm, rather than by simply concatenating them, so that collecting results scales as well.
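The difference between reduction and concatenation can be sketched with a small simulation. This is not MRNet's API (MRNet is a C++ library). The `tree_reduce` function, the fan-out of 16, and the per-process reports are assumptions chosen for illustration. Each simulated process reports the function it is stopped in, and every internal node of the tree merges the counts of its children before passing them up.

```python
from collections import Counter
from typing import List


def tree_reduce(leaves: List[Counter], fanout: int) -> Counter:
    """Merge leaf results level by level, as a tree of the given fan-out would."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not leaves:
        return Counter()
    level = leaves
    while len(level) > 1:
        # Each internal node merges the results of up to `fanout` children,
        # so the data sent upward stays small whatever the number of leaves.
        level = [
            sum(level[i:i + fanout], Counter())
            for i in range(0, len(level), fanout)
        ]
    return level[0]


# 1000 simulated processes: all are waiting except one that is still computing.
reports = [Counter({"mpi_wait": 1}) for _ in range(1000)]
reports[42] = Counter({"compute": 1})

merged = tree_reduce(reports, fanout=16)
print(merged["mpi_wait"], merged["compute"])  # 999 1
```

With concatenation, the root would receive 1000 separate reports here, and 1M on the machine discussed in the talk. With the reduction, it receives one small summary, and the odd process still stands out.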
The self-propelled instrumentation presentation was about instrumenting distributed applications. It follows the control path and instruments the application on the fly. All the instrumentation is in userspace, using executable patching. If I understood correctly, when a client connects to a server on a different machine, a background ssh connection is made to the other host, the instrumentation is injected into the peer process, and the program continues. The instrumentation itself is a callback that is attached to a function. For the demo, the function simply called printf.
Besides that, people were very welcoming, and it was a great pleasure to share ideas. Cherry blossoms are everywhere here in Washington and College Park; we should definitely have more in Montreal!