Profiling with OProfile
OProfile is an operating-system-level profiler for Linux that's known to give useful results when profiling PostgreSQL.
Initial Setup
This needs to be done only once per system boot:
sudo opcontrol --init
sudo opcontrol --setup --no-vmlinux
although you can re-issue the --setup command to change profiling options later.
If you need detail about what the kernel is doing, get debug symbols for your kernel; then the setup command looks something like
sudo opcontrol --setup --vmlinux=/usr/lib/debug/lib/modules/`uname -r`/vmlinux
There are a vast number of other options to the --setup command, but you don't need them for ordinary cases.
Virtual Machine Setup
If you are using a virtual machine, you might need to make oprofile use a timer instead of trying to use hardware performance counters that might not be available. You can do that on the fly by shutting down oprofile if it's running and executing:
sudo opcontrol --deinit
sudo modprobe oprofile timer=1
You can make this setting permanent by adding a file to /etc/modprobe.d (at least on Fedora) called oprofile.conf with these contents:
options oprofile timer=1
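One way to create that file from the shell is sketched below. The function name write_oprofile_conf and the CONF_DIR variable are assumptions introduced for illustration; only the file name, directory, and contents come from the text above.

```shell
# Sketch: write the oprofile.conf file described above.
# CONF_DIR is a hypothetical override so the function can be tried
# without touching /etc; it defaults to /etc/modprobe.d.
write_oprofile_conf() {
    conf_dir="${CONF_DIR:-/etc/modprobe.d}"
    printf 'options oprofile timer=1\n' > "$conf_dir/oprofile.conf"
}
```

Writing to /etc/modprobe.d requires root, so call the function from a root shell when using the default directory.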
Start/stop profiling
sudo opcontrol --start
sudo opcontrol --reset
... exercise your debug-enabled program here ...
sudo opcontrol --dump ; sudo opcontrol --shutdown
The test case should run for at least a minute or two to get numbers with reasonable accuracy.
Analysis:
opreport --long-filenames | more
opreport -l image:/path/to/postgres | more
If you really want detail:
opannotate --source /path/to/postgres >someplace
The --reset command zeroes the stats that these programs report on, so be sure to save the output you want before running another test case. Also, modifying the executable program-under-test invalidates the stats for it.
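The start/stop sequence and the advice to save output before the next --reset can be combined in a small wrapper. This is a minimal sketch, not an official OProfile tool: the function name profile_run and the SUDO, OPCONTROL, and OPREPORT variables are assumptions introduced here so the sequence can be dry-run; only the opcontrol and opreport invocations come from the steps above.

```shell
# Hypothetical override hooks (not part of OProfile), e.g. for a dry run:
#   SUDO= OPCONTROL="echo opcontrol" OPREPORT="echo opreport"
: "${SUDO=sudo}" "${OPCONTROL=opcontrol}" "${OPREPORT=opreport}"

# profile_run REPORT_FILE IMAGE COMMAND [ARGS...]
profile_run() {
    report_file=$1   # where to save the per-symbol report
    image=$2         # binary to report on, e.g. /path/to/postgres
    shift 2          # what remains is the workload command

    $SUDO $OPCONTROL --start
    $SUDO $OPCONTROL --reset
    "$@"             # exercise the debug-enabled program
    status=$?
    $SUDO $OPCONTROL --dump
    $SUDO $OPCONTROL --shutdown

    # Save the report now; the next --reset would zero these stats.
    $OPREPORT -l "image:$image" > "$report_file"
    return $status
}
```

For example, `profile_run report.txt /path/to/postgres ./my_test_case.sh` would profile one run of a (hypothetical) my_test_case.sh workload script and leave the symbol report in report.txt.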
Additional analysis
Getting callgraph information from oprofile can provide additional detail about libraries that are being used, including use of common libc library calls. Although it won't directly tell you which function in libc is being called, you can see where the calls are coming from, which is usually enough to guess what the libc function is.
Important: to get useful callgraph information, you must compile with -fno-omit-frame-pointer.
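A callgraph run might look like the sketch below. The stack depth of 16 and the -O2 optimization level are illustrative assumptions, and the function names and the SUDO, OPCONTROL, and OPREPORT variables are hypothetical hooks added so the commands can be dry-run. The --callgraph options of opcontrol and opreport are standard, but check them against your installed version.

```shell
# Hypothetical override hooks (not part of OProfile) for dry runs.
: "${SUDO=sudo}" "${OPCONTROL=opcontrol}" "${OPREPORT=opreport}"

# Flags to build the program under test with frame pointers kept,
# e.g. ./configure CFLAGS="$PROFILE_CFLAGS"
PROFILE_CFLAGS="-O2 -fno-omit-frame-pointer"

callgraph_setup() {
    # Record call stacks up to the given depth (assumed default: 16).
    $SUDO $OPCONTROL --setup --no-vmlinux --callgraph="${1:-16}"
}

callgraph_report() {
    # Print caller/callee relationships for one binary.
    $OPREPORT --callgraph "image:$1"
}
```

Run callgraph_setup before starting the profiler, and callgraph_report /path/to/postgres after the dump.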
You can also get the oprofile data, including callgraph, into kcachegrind, which is *very* helpful, using a script such as http://roberts.vorpus.org/~njs/op2calltree.py
Interpreting the results
A good example showing how to use the results from oprofile to help make a performance-related coding decision is "oprofile results for stats collector test".
Other alternatives
OProfile is preferred to gprof because of several limitations of gprof.
Another possibility is to use valgrind for profiling; example usage
Newer Linux systems (kernels 2.6.31 and later) can use the perf utility.
Installing the software
Whether OProfile works correctly depends heavily on how your Linux distribution packaged it. According to the Unladen-swallow project, "The oProfile packages in Debian Sid and Ubuntu Hardy, Jaunty, Karmic, and Lucid are all broken"; the project suggests an alternate package that you can swap for the system one. Debian Squeeze should have a working version, as shown in the good Ceph tutorial on using the program. You will need the linux-image-debug package to trace into the kernel, and when that's available you may be able to start the program without the "--no-vmlinux" flag.