Callgrind is a profiler that helps to find hotspot in the application. One limitation of a profiler is that the time spent waiting is not taken into account, for instance I/O, resources or other tasks. This situation is shown in figure 1, where we traced the command "sudo apt-get install tree --yes". The actual computation done in the apt-get process accounts for 63% of the elapsed time, while waiting time is about 37%. The callgrind results covers only instructions from the application, not the waiting time.
Figure 1: State of apt-get process according to time. |
How to effectively understand what the application was waiting for? For this matter, we performed a waiting dependency analysis on all tasks on the system. The event sched_wakeup indicates the waiting source. Figure 2 shows the result of waiting dependencies among tasks for the command.
|
Figure 3: I/O wait of dpkg according to time. |
Figure 4: Execution of mandb according to time. |
7ffff729a30b __execvpe (/lib/x86_64-linux-gnu/libc-2.17.so)
7ffff75a4de4 pipecmd_exec (/usr/lib/x86_64-linux-gnu/libpipeline.so.1.2.2)
409813 [unknown] (/usr/local/bin/mandb)
7ffff75a4e59 pipecmd_exec (/usr/lib/x86_64-linux-gnu/libpipeline.so.1.2.2)
7ffff75a662a pipeline_start (/usr/lib/x86_64-linux-gnu/libpipeline.so.1.2.2)
408d96 find_name (/usr/local/bin/mandb)
40449c test_manfile (/usr/local/bin/mandb)
404708 testmandirs (/usr/local/bin/mandb)
405035 update_db (/usr/local/bin/mandb)
40a1c6 mandb (/usr/local/bin/mandb)
40a4fb process_manpath (/usr/local/bin/mandb)
403777 main (/usr/local/bin/mandb)
7ffff71f9ea5 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.17.so)
403b09 _start (/usr/local/bin/mandb)
At the end of apt-get execution, there is a sleep of 500 ms, shown in blue in the figure 2. Here is the location in the code of this nanosleep, again using perf callchain feature:
ffffffff816cc0c0 __schedule ([kernel.kallsyms])
ffffffff816cc799 schedule ([kernel.kallsyms])
ffffffff816cb66c do_nanosleep ([kernel.kallsyms])
ffffffff810827f9 hrtimer_nanosleep ([kernel.kallsyms])
ffffffff8108292e sys_nanosleep ([kernel.kallsyms])
ffffffff816d616f tracesys ([kernel.kallsyms])
7ffff72978c0 __GI___libc_nanosleep (/lib/x86_64-linux-gnu/libc-2.17.so)
7ffff7b8ac95 pkgDPkgPM::DoTerminalPty(int) (/usr/lib/x86_64-linux-gnu/libapt-pkg.so.4.12.0)
7ffff7b93720 pkgDPkgPM::Go(int) (/usr/lib/x86_64-linux-gnu/libapt-pkg.so.4.12.0)
7ffff7b26585 pkgPackageManager::DoInstallPostFork(int) (/usr/lib/x86_64-linux-gnu/libapt-pkg.so.4.12.0)
415a1b [unknown] (/usr/bin/apt-get)
418dfc [unknown] (/usr/bin/apt-get)
7ffff7af96c2 CommandLine::DispatchArg(CommandLine::Dispatch*, bool) (/usr/lib/x86_64-linux-gnu/libapt-pkg.so.4.12.0)
40a2ec [unknown] (/usr/bin/apt-get)
Here is the detail of pkgDPkgPM::DoTerminalPty() function from apt/apt-pkg/deb/dpkgpm.cc:
/*
* read the terminal pty and write log
*/
void pkgDPkgPM::DoTerminalPty(int master)
{
unsigned char term_buf[1024] = {0,0, };
ssize_t len=read(master, term_buf, sizeof(term_buf));
if(len == -1 && errno == EIO)
{
// this happens when the child is about to exit, we
// give it time to actually exit, otherwise we run
// into a race so we sleep for half a second.
struct timespec sleepfor = { 0, 500000000 };
nanosleep(&sleepfor, NULL);
return;
}
if(len <= 0)
return;
FileFd::Write(1, term_buf, len);
if(d->term_out)
fwrite(term_buf, len, sizeof(char), d->term_out);
}
The comment mentions a possible race, but the caller may be fixed to account for this situation. IMHO, this function should not sleep if an error occurs while reading the file descriptor, and should instead return immediately.
This was a small example of what can be achieved with the waiting dependency analysis from a kernel trace. The analysis plug-in will be available soon in Eclipse Linux Tools, and uses standard tracepoints from the Linux kernel, recorded with LTTng. I hope that this information will help the community to improve performance of installing packages with APT.
Cheers!