Profiling with OProfile

From PostgreSQL wiki

OProfile is an operating-system-level profiler for Linux that is known to give useful results when profiling PostgreSQL.

Initial Setup

This needs to be done only once per system boot:

 sudo opcontrol --init
 sudo opcontrol --setup --no-vmlinux

although you can re-issue the --setup command to change profiling options later.

If you need detail about what the kernel is doing, get debug symbols for your kernel; then the setup command looks something like

 sudo opcontrol --setup --vmlinux=/usr/lib/debug/lib/modules/`uname -r`/vmlinux
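The choice between the two setup forms can be scripted. The following is a minimal sketch, assuming the kernel debug symbols live at the path used above (this varies by distribution); the function name `oprofile_setup_args` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the argument to pass to "opcontrol --setup".
# Assumes the debug-symbol path layout shown above; distributions differ.
oprofile_setup_args() {
    vmlinux="${1:-/usr/lib/debug/lib/modules/$(uname -r)/vmlinux}"
    if [ -r "$vmlinux" ]; then
        # Debug symbols are readable: profile the kernel in detail.
        printf '%s\n' "--vmlinux=$vmlinux"
    else
        # No debug symbols: fall back to the no-kernel-detail setup.
        printf '%s\n' "--no-vmlinux"
    fi
}

# Example use (not run here):
#   sudo opcontrol --setup "$(oprofile_setup_args)"
```

Passing the path as an optional argument makes it easy to point the helper at a non-default location.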

There are a vast number of other options to the --setup command, but you don't need them for ordinary cases.

Virtual Machine Setup

If you are using a virtual machine, you might need to make oprofile use a timer instead of trying to use hardware performance counters that might not be available. You can do that on the fly by shutting down oprofile if it's running and executing:

 sudo opcontrol --deinit
 sudo modprobe oprofile timer=1

You can make this setting permanent by adding a file to /etc/modprobe.d (at least on Fedora) called oprofile.conf with these contents:

 options oprofile timer=1
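Creating that file can be sketched as a small helper. The function name `write_oprofile_timer_conf` is hypothetical, and the target directory is a parameter so the sketch can be tried on a scratch directory before touching /etc/modprobe.d.

```shell
#!/bin/sh
# Hypothetical helper: write the module option file into a modprobe.d directory.
# Defaults to /etc/modprobe.d, which requires root to write.
write_oprofile_timer_conf() {
    dir="${1:-/etc/modprobe.d}"
    printf '%s\n' 'options oprofile timer=1' > "$dir/oprofile.conf"
}

# Example use as root (not run here):
#   write_oprofile_timer_conf /etc/modprobe.d
```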

Start/stop profiling

 sudo opcontrol --start
 sudo opcontrol --reset
 ... exercise your debug-enabled program here ...
 sudo opcontrol --dump ; sudo opcontrol --shutdown

The test case should run at least a minute or two to get numbers with reasonable accuracy.
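The start/stop sequence above can be wrapped around a workload command. This is a minimal sketch, not an official tool: `profile_run` and the `OPCONTROL` override variable are hypothetical names, and the wrapper only issues the opcontrol commands already shown.

```shell
#!/bin/sh
# Hypothetical wrapper around the start/stop sequence shown above.
# OPCONTROL is an assumed override hook; it defaults to "sudo opcontrol".
profile_run() {
    opc="${OPCONTROL:-sudo opcontrol}"
    # $opc is left unquoted on purpose so "sudo opcontrol" splits into two words.
    $opc --start || return 1
    $opc --reset
    "$@"                # the workload to profile
    status=$?
    $opc --dump
    $opc --shutdown
    return "$status"    # report the workload's exit status to the caller
}

# Example use (not run here), with a hypothetical workload script:
#   profile_run ./run_test_case.sh
```

Setting `OPCONTROL=echo` turns the wrapper into a dry run that only prints the opcontrol arguments it would have used.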

Analysis:

 opreport --long-filenames | more
 opreport -l image:/path/to/postgres | more

If you really want detail:

 opannotate --source /path/to/postgres >someplace

The --reset command zeroes the stats that these programs report on, so be sure to save the output you want before running another test case. Also, modifying the executable program-under-test invalidates the stats for it.
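One way to avoid losing results is to write every report to a per-test directory before starting the next run. The sketch below assumes that layout; `save_oprofile_reports` and the output file names are hypothetical, and the report commands are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: save the reports for one test case before the next --reset.
# Arguments: path to the postgres binary, then an output directory.
save_oprofile_reports() {
    binary="$1"
    outdir="$2"
    mkdir -p "$outdir" || return 1
    opreport --long-filenames > "$outdir/summary.txt"
    opreport -l "image:$binary" > "$outdir/symbols.txt"
    opannotate --source "$binary" > "$outdir/annotated.txt"
}

# Example use (not run here):
#   save_oprofile_reports /path/to/postgres ./oprofile-results/test1
```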

Additional analysis

Getting callgraph information from oprofile can provide additional detail about which libraries are being used, including calls into common libc functions. Although it won't directly tell you which libc function is being called, it shows where the calls are coming from, which is usually enough to guess what the libc function is.

Important: to get useful callgraph information, you must compile with -fno-omit-frame-pointer.
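A callgraph session might look roughly like the sketch below. It assumes a PostgreSQL source build and relies on the `--callgraph` option of opcontrol and the `-c` option of opreport; check the man pages for your oprofile version. The depth of 16 and all three function names are illustrative choices, not recommendations from this page.

```shell
#!/bin/sh
# Sketch only. CALLGRAPH_DEPTH and the function names are illustrative.
CALLGRAPH_DEPTH=16

# Run from the top of a PostgreSQL source tree: build with frame pointers kept.
configure_with_frame_pointers() {
    ./configure --enable-debug CFLAGS="-O2 -fno-omit-frame-pointer" "$@"
}

# Ask the profiler to record call chains up to the chosen depth.
setup_callgraph() {
    sudo opcontrol --setup --no-vmlinux --callgraph="$CALLGRAPH_DEPTH"
}

# Print the callgraph report for one binary, given its path.
callgraph_report() {
    opreport -c "image:$1"
}

# Example use (not run here):
#   configure_with_frame_pointers && make && make install
#   setup_callgraph
#   ... start, exercise, dump and shut down as usual ...
#   callgraph_report /path/to/postgres | more
```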

You can also load the oprofile data, including the callgraph, into kcachegrind, which is *very* helpful, using a script such as http://roberts.vorpus.org/~njs/op2calltree.py

Interpreting the results

A good example showing how to use the results from oprofile to help make a performance-related coding decision is "oprofile results for stats collector test".

Other alternatives

OProfile is preferred to gprof because of several limitations of gprof.

Another possibility is to use valgrind for profiling; example usage

Newer Linux systems (kernels 2.6.31 and later) can use the perf utility.

Installing the software

Whether OProfile works correctly depends heavily on how your Linux distribution packaged it. According to the Unladen-swallow project, "The oProfile packages in Debian Sid and Ubuntu Hardy, Jaunty, Karmic, and Lucid are all broken"; that project suggests an alternate package you can swap in for the system one. Debian Squeeze should have a working version, as shown in the good Ceph tutorial on using the program. You will need the linux-image-debug package to trace into the kernel; once it is installed, you may be able to start the program without the "--no-vmlinux" flag.