Getting a stack trace of a running PostgreSQL backend on Windows
From PostgreSQL wiki
Before you start
There are a few things you need to be aware of to get a useful stack trace:
- To get a useful stack trace, your symbol path must be configured to include the debugging symbol (.pdb) files of the programs you want to debug. See the section on the symbol path below.
- It appears that debugging a 32-bit PostgreSQL on 64-bit windows produces useless results, at least when the 64-bit debugging tools are used
- If you plan to use Visual Studio, you'll need a paid version, or Express 2008. Express 2010 will not work because critical features were removed.
For more information, see the section on how to verify that you have a useful stack trace, below.
Installing Debugging Tools for Windows and Process Explorer
Microsoft Windows does not come with built-in debugging tools. You will need to install the Debugging Tools for Windows, which you can download for:
I strongly suggest that you install these tools to the standard locations.
You will probably also want Process Explorer from the Sysinternal suite.
Once you have installed Process Explorer and the Debugging Tools for Windows, you'll need to do a little bit of setup so you can get useful stack traces.
Note: The instructions below assume your copy of Windows is installed on
c: so you may have to adjust them if this is not the case.
Configuring the symbol path
First, you need to set the symbol path. First you have to figure out what it is, since it depends on the version of PostgreSQL you've installed, the language of your version of Windows, and where PostgreSQL was installed to. What you need to find is the location of the
symbols folder inside the PostgreSQL install directory. If you've used the EnterpriseDB 1-click installer for PostgreSQL 8.4 and you're using an English version of Windows, it'll be:
I'll assume that's the case in the following instructions, so adjust the path shown to match your system if required.
To set the symbol path, you'll need to use the System control panel. You can get to this quickly and conveniently with all versions of Windows by holding down the Windows key and pressing the BREAK key (it's often under the Pause key). You can also find it in the Control Panel; you may need to search for "system". If you can't find it, open the Start menu and click "Run" then enter "sysdm.cpl" and press enter.
In the System control panel, click on the Advanced tab and then click "Environment Variables". An "Environment Variables" dialog will appear. In that dialog, under the label "User Variables for <username>" you'll see a list. What you need to do is click the "New" button under that list.
In the dialog that appears, enter the variable name exactly as:
Note the leading underscore ("_") and the fact that the variable name is ALL UPPER CASE with UNDERSCORES_BETWEEN_WORDS.
The value is in two parts. The first part depends on where PostgreSQL is installed, as described above. The second part tells debugging tools to use a Microsoft server (the Symbol Server) to find out about parts of Windows its self.
If you don't have an active Internet connection you can download the Windows symbol package for your version of Windows and install it, but be aware that this will be a BIG download. You'll then need to adjust the symbol path shown above to point to it.
Setting up Process Explorer
Open Process Exploer from the start menu, use "procexp.exe" in the Start->Run dialog. Read and accept the license agreement if prompted.
Now, in the "options" menu choose "configure symbols". First, note the "symbols path" entry, which should be the same as the value you entered for _NT_SYMBOL_PATH in the System control panel. If it isn't, you've made a mistake setting up the environment variable for the symbol path in the System control panel.
There is another entry in the "configure symbols" dialog labelled "dbghelp.dll path". If this begins with "c:\windows" you'll need to change it to the alternative provided by the Debugging Tools for Windows. The exact location varies depending on your debugging tools version. Use the "..." button, type
%ProgramFiles% into the "file name" field and press enter. Now find the debugging tools for windows folder, open it, and find dbghelp.dll. Double click on it. The dbghelp.dll path should've changed to something like:
C:\Program Files\Debugging Tools for Windows (x86)\dbghelp.dll
You can now click "OK".
Checking the setup
You should now be able to look at the stack trace of a commonplace process that's pre-installed on Windows systems. I suggest starting Notepad (start->run->notepad.exe), then launching Process Explorer, double-clicking on "notepad.exe" in the process list, switching to the "threads" tab, and double-clicking on the only entry in the list shown there. The displayed stack trace should look like this (on Windows XP SP3 at least):
ntkrnlpa.exe!KiSwapThread+0x46 ntkrnlpa.exe!KeWaitForSingleObject+0x1c2 win32k.sys!xxxSleepThread+0x192 win32k.sys!xxxRealInternalGetMessage+0x418 win32k.sys!NtUserGetMessage+0x27 ntkrnlpa.exe!KiFastCallEntry+0xf8 ntdll.dll!KiFastSystemCallRet USER32.dll!NtUserGetMessage+0xc notepad.exe!WinMain+0xe5 notepad.exe!WinMainCRTStartup+0x174 kernel32.dll!BaseProcessStart+0x23
If it looks like this instead (lots of lines with just "modulename+offset" not "modulename!symbolname+offset"):
ntkrnlpa.exe+0x6a822 ntkrnlpa.exe!ZwYieldExecution+0x1938 ntkrnlpa.exe!ZwYieldExecution+0x19a4 win32k.sys+0x2f35 win32k.sys+0x1b0d win32k.sys!EngQueryPerformanceCounter+0x5af ntkrnlpa.exe!KeReleaseInStackQueuedSpinLockFromDpcLevel+0xb14 ntdll.dll!KiFastSystemCallRet notepad.exe+0x2a1b notepad.exe+0x7511 kernel32.dll!RegisterWaitForInputIdle+0x49
... then your symbol path is set up wrong or there's something interfering with access to the symbol server. See #No symbolic names? Why?.
You might also see a backtrace like this:
00000000`0013e6b8 00000000`73d5282c wow64cpu!CpupSyscallStub+0x9 00000000`0013e6c0 00000000`73dcd07e wow64cpu!WaitForMultipleObjects32+0x32 00000000`0013e780 00000000`73dcc549 wow64!RunCpuSimulation+0xa 00000000`0013e7d0 00000000`779884c8 wow64!Wow64LdrpInitialize+0x429 00000000`0013ed20 00000000`77987623 ntdll!LdrpInitializeProcess+0x17e2 00000000`0013f220 00000000`7797308e ntdll! ?? ::FNODOBFM::`string'+0x2bea0 00000000`0013f290 00000000`00000000 ntdll!LdrInitializeThunk+0xe
with lines about wow64. This backtrace is useless for debugging. You're debugging a 32-bit binary on 64-bit windows with 64-bit debugging tools. You might be able to get a useful trace by installing and using the 32-bit debugging tools, but I haven't verified that.
Getting the stack trace
Phew. You now have things configured and can actually start collecting information.
Important: Don't waste your time generating stack traces that're useless. Please read #How to make sure a stack trace is useful before spending lots of time collecting information or posting to the mailing list. Your stack trace needs to contain "!SymbolicNames" as well as "+offsets" to be much use.
If you are collecting information about a postgresql backend that appears to be in an infinite loop or using too much CPU, you will need to take more than one stack trace. You should try to get quite a few over a short period of time so that developers can tell where it's spending its time. One stack trace alone doesn't really tell you a lot about what it's doing overall.
Getting the process ID of the PostgreSQL backend
Identifying the process ID of the PostgreSQL process you're interested in isn't always easy. If you have an active connection to the backend you want to look at, you can run
select pg_backend_pid(); to get it's process ID. Sometimes it's obvious - it's the one using constantly high amounts of CPU as shown in Process Explorer, for example. At other times, though, you'll need to use psql or PgAdmin III to look in the
pg_catalog.pg_locks tables/views to find out the process id ("pid" or "procpid") of the process you're interested in.
Getting a stack trace using Process Explorer
Launch Process Explorer from the Start menu, or by running "procexp.exe".
You'll need to find the
postgres.exe instance you're interested in. It might be one showing consistently high CPU usage, but if not, see the instructions above for finding it by process ID.
Once you know which
postgres.exe backend you want to examine, double-click on it in the list in Process Explorer. In the dialog that appears open the "Thread" tab. You'll now see a list of threads. In most cases you're interested in the first thread in the list. Select it by clicking on it and click the "stack" button. After a short delay a list will appear in a new dialog, looking something like this:
ntkrnlpa.exe!KiSwapContext+0x2f ntkrnlpa.exe!KiSwapThread+0x8a ntkrnlpa.exe!KeWaitForSingleObject+0x1c2 ntkrnlpa.exe!KiSuspendThread+0x18 ntkrnlpa.exe!KiDeliverApc+0x124 ntkrnlpa.exe!KiSwapThread+0xa8 ntkrnlpa.exe!KeWaitForMultipleObjects+0x284 ntkrnlpa.exe!NtWaitForMultipleObjects+0x297 ntkrnlpa.exe!KiFastCallEntry+0xfc ntdll.dll!KiFastSystemCallRet ntdll.dll!ZwWaitForMultipleObjects+0xc kernel32.dll!WaitForMultipleObjectsEx+0x12c postgres.exe!pgwin32_waitforsinglesocket+0x1f0 postgres.exe!pgwin32_recv+0x90 postgres.exe!secure_read+0x17d postgres.exe!pq_recvbuf+0x71 postgres.exe!pq_getbyte+0x15 postgres.exe!SocketBackend+0x6 postgres.exe!PostgresMain+0xbe8 postgres.exe!BackendRun+0x204 postgres.exe!SubPostmasterMain+0x224 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23
See how there's different text after the exclamation mark on each line? If that's not the case, your symbol path is probably set up wrong, and Process Explorer can't figure out what the program is doing. Without the text after the exclamation marks the stack trace is mostly useless, so you'll need to collect that information.
You need to copy the whole list. Click on the first entry, hold down the shift key, and press the "end" key (or scroll to the bottom and click the bottom entry), then click the "copy" button.
You can now paste the stack trace into your email program to send it to the PostgreSQL mailing list. By its self it is not useful, so you MUST provide the usual debugging information too - your OS version, PostgreSQL version, what's going wrong, what you're doing when it goes wrong, etc etc. See Guide to reporting problems.
Getting a stack trace using WinDBG
The Debugging Tools for Windows include a program called
windbg.exe, the Windows Debugger. It's a very powerful and useful tool, but it's not user friendly. Nonetheless, it provides better debugging information than Process Explorer can, and unlike Process Explorer it is capable of capturing debugging information from a
postgres.exe backend that crashes. Windbg.exe may also be used to remotely debug a system even if you don't have full GUI access. Note, however, that unlike Process Explorer using windbg may interrupt the execution of the backend or possibly (though not usually) even hold up the entire database system.
You will need to know the process ID of the backend you want to connect to. See "Getting the process ID of the PostgreSQL backend" above for how to find that out.
Once you know the process ID, you need to start WinDBG. It'll be in the Start menu under Debugging Tools for Windows. To connect to the PostgreSQL backend of interest, use the File -> Attach to Process menu item (shortcut: F6). Enter the pid you found earlier into the "process id" field and press enter.
windbg has now attached to that postgres backend. Its execution has paused, and it will appear to have hung. This is normal. You're in a command-line interface, but you don't really need to know much about it. All you'll need to do is type the command:
in and press enter, then select all the output (CTRL-A) and copy it (CTRL-C) to the clipboard. You can now paste this information into a text file, email, etc. It'll look something like this:
0 Id: b0c.b1c Suspend: 1 Teb: 7ffdf000 Unfrozen ChildEBP RetAddr 00cafc2c 7c90df5a ntdll!KiFastSystemCallRet 00cafc30 7c8025db ntdll!ZwWaitForSingleObject+0xc 00cafc94 7c802542 kernel32!WaitForSingleObjectEx+0xa8 00cafca8 00587866 kernel32!WaitForSingleObject+0x12 00cafcb4 0058ba4c postgres!pg_usleep+0x36 [c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c @ 56] 00cafd18 00596075 postgres!AutoVacLauncherMain+0x24c [c:\pginstaller-repo\postgres.windows\src\backend\postmaster\autovacuum.c @ 597] 00caff60 00502da7 postgres!SubPostmasterMain+0x2b5 [c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c @ 3900] 00caff7c 006b7aad postgres!main+0x177 [c:\pginstaller-repo\postgres.windows\src\backend\main\main.c @ 165] 00caffc0 7c817027 postgres!__tmainCRTStartup+0x10f [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 597] 00cafff0 00000000 kernel32!BaseProcessStart+0x23 1 Id: b0c.95c Suspend: 1 Teb: 7ffde000 Unfrozen ChildEBP RetAddr 0162fecc 7c90d3aa ntdll!KiFastSystemCallRet 0162fed0 7c83145e ntdll!ZwFsControlFile+0xc 0162ff14 0058777f kernel32!ConnectNamedPipe+0x52 0162ffb4 7c80b6d9 postgres!pg_signal_thread+0x9f [c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c @ 275] 0162ffec 00000000 kernel32!BaseThreadStart+0x37 # 2 Id: b0c.208 Suspend: 1 Teb: 7ffdc000 Unfrozen ChildEBP RetAddr 0163ffc8 7c951e68 ntdll!DbgBreakPoint 0163fff4 00000000 ntdll!DbgUiRemoteBreakin+0x2d
If you've collected everything you needed to you can detach from the PostgreSQL proces (debug menu -> Detach debugee) and close WinDBG. Alternately, if you want to collect a few more stack traces at different points in execution, use the "G" command to tell PostgreSQL to resume running. When you want to interrupt it again, use "Break" from the Debug menu and enter
~*k to get another set of stack traces.
If you want to learn more about using windbg, this article is a good starting point. If you're familiar with
gdb you'll also want to read this article on gdb for WinDBG users that's pretty handy if you're going in the other direction too.
Getting a stack trace of a repeatable backend crash
If you're trying to track down a backend crash and you can reasonably predictably trigger the crash in a connection of your choice, you'll probably want to use windbg. You will need to attach windbg to the backend while it's still running (see above) and issue the "G" command in windbg to get the backend to continue running. Now you need to do whatever is required to trigger the crash in the backend using your connection to it.
When the backend crashes, windbg should interrupt the crash. At this point, you can run the following command in windbg to get some stack trace information about the crash:
Using crash dumps to debug random and unpredictable backend crashes
If the crashes appear to be random and you don't know how to trigger them, it's hard to connect windbg or Visual Studio to the problem
postgres.exe before it crashes.
windbg as the JIT (just-in-time) or post-mortem debugger won't help you, because PostgreSQL generally runs as a service under a different user account that cannot interact with the desktop. You could always
initdb a new cluster under your normal user account and use
pg_ctl to start the postmaster with that cluster manually, so you can JIT debug under your own user account where Pg can interact with the desktop. This isn't suitable for production use, though, and you might not be able to reproduce the problem that way.
In PostgreSQL 9.0 and above there is a crash dump hander included in PostgreSQL. To use it:
- Create a directory named
crashdumps(all lower case) in the PostgreSQL data directory (as shown by
- Give the PostgreSQL user (
postgresby default) "full control" of it in the security tab of the folder properties
- Run the problem code. You don't need to restart Pg or change any settings.
- When a backend crashes, a Windows minidump should be created in the
You can debug this file with Microsoft Visual Studio or with windbg, using much the same procedures you use when attaching to a process ID as explained above. Unfortunately the more recent Express editions of Visual Studio do not have the ability to debug minidumps.
Minidumps can also be sent to somebody else to debug. For this to work you must tell the other person the *exact* PostgreSQL version you are using, because the debug symbols vary from minor release to minor release. You can't debug a 9.1.0 minidump with debug symbols taken from a 9.1.1 install. It's often easiest to just send them the
.pdb files from your install alongside the minidump.
Crashes during postmaster startup
This useful article on debugging Windows services will help you deal with issues where the postmaster is crashing during startup of the Windows service for PostgreSQL.
Remote debugging with windbg.exe
You might not want to, or be able to, sit at the database server console for debugging. That won't stop you using windbg on PostgreSQL, though - you can launch a remote debugging server on the Pg machine and remote debug the machine.
- On the Pg server:
dbgsrv.exefrom the debugging tools for windows folder to the remote server running Pg (or install the debugging tools there).
- Add an exception for dbgsrv.exe in the firewall of the remote machine
- Launch dbgsrv.exe on the remote machine as an Administrator. Use the command:
dbgsrv.exe -t tcp:port=12345,password=fred. (If you need to, use
runas.exeto get admin rights).
- On the workstation you're debugging from:
- Launch windbg
- In the file menu, choose "Connect to remote stub..."
- As the connection string enter "tcp:server=YOUR_PG_SERVER_NAME_OR_IP,port=12345,password=fred"
- Click "OK"
- Windbg will now try to do all work via the remote debug stub, as if you were running windbg on that machine, so you can follow the instructions as given above.
Debugging using Microsoft Visual Studio or VS Express 2008
Windbg is free, quick and easy to install, supports remote debugging, and a real pain to use. You'll want something much nicer to use if you're trying to do more than just get a stack trace - and for some problems, you will indeed need more than a stack trace to know what's going wrong. If you intend to step through the execution of PostgreSQL's code and examine the value of variables, using Microsoft Visual C++ will make it a lot easier.
Microsoft Visual C++ has a much nicer debugger that I highly recommend using in favour of windbg.exe - if you have the option. The free 2008 express edition can be used for many debugging tasks; it lacks remote debugging support and just-in-time postmortem debugging (which the full, paid edition has) but otherwise does a very good job as a user-friendly debugger. Unfortunately VS 2010 removes the ability to attach to already-running processes, so unlike 2008 it's basically useless for debugging PostgreSQL. This means you can't use 2010 express edition at all, and you can't use 2008 express edition for debugging random crashes, but 2008 is still great for tracking down predictable crashes and runaway backends - anything where you can connect to the backend by process ID.
Once you've installed VC++ 2008 Express, launch it and use "Attach to Process" in the Tools menu to connect to the PostgreSQL backend you want. Make sure "show processes from all users" is checked, select the
postgres.exe process you want to debug, and click "Attach". You'll get a security warning, which you should read and understand before accepting.
VC++ will load symbols for PostgreSQL, which may take a few minutes if you haven't used VC++ to debug PostgreSQL before (it'll have to fetch them from the symbol server). Once VC++ finishes loading symbols, execution of the Pg backend will resume and your database system will be working normally.
Breaking into execution
You can now break into the execution of the postgres.exe backend with the break button on the toolbar. It looks like a pause button. If you're debugging a runaway backend, you'll want to break into its execution now. Alternately, if you're debugging a crash, you should trigger the crash now.
If VC++ detects a crash it'll display a dialog like this to give you the chance to break into the process's execution:
If you break into a process's execution while it's running in the kernel you'll see a dialog like this:
... which is common for PostgreSQL since it'll often be waiting for network or disk activity. You can generally ignore it.
You'll get a dialog telling you that "there is no source code available for your current location". Click OK. Now, on the bottom right (by default) you'll see a set of tabs named "call stack", "breakpoints", and "output". Click on "call stack".
You can see the stack of function calls that were in progress in Pg's main thread at the time you interrupted postgresql. (If you want or need to change threads, use the "threads" tab on the bottom left - double click on the thread of interest to debug it - though you usually want the main thread which is active by default).
In the call stack you can double-click on an entry to see where in the executable the function call arose. If you don't have source code, it'll offer to show you a disassembly, but you have source code for PostgreSQL so you can do better than that. (If you don't have source code for Pg, download and extract the sources for your version now. If you can't open the .tar.gz file, install [www.7-zip.org/|7-zip]). Double-click on the topmost line in the stack trace that starts with "postgres.exe" and a dialog will appear asking for a .c file:
It'll note the original location of the file at the top of the dialog. In my case, it wants
src\backend\port\win32\socket.c, so I find the postgresql sources (on my machine:
c:\developer\postgresql-8.4.0) and from there open
win32. Visual studio notices "win32.c" within the directory and opens it automatically. From now on, during this debugging session VS should find all other postgresql source files too.
The Visual Studio debugger is far too capable to fully describe here. I'll just give a few hints - and suggest that you use the documentation.
Now that you have sources displayed, you can step through the code line-by-line, run until the end of functions, step into functions, and otherwise control the execution of the PostgreSQL process in detail using the buttons on the toolbar. You can also examine local variables at the stack frame you're on using the "locals" tab on the left, set breakpoints in functions using the "breakpoints" tab on the right, etc.
For example, if I wanted to pause execution the next time PostgreSQL tried to receive data from the network I might set a breakpoint on the
pgwin32_recv function and resume execution:
If I then issue a query on the backend I'm debugging, Visual Studio will shortly break back into the debugger at the start of the
pgwin32_recv function. I can now step through the function's execution to see what functions are called, what values variables take, etc. I can add and remove breakpoints by clicking in the margin on the left of the source file display.
Note that you must be careful not to change the values of any variables or cause execution to skip lines using the "set next statement" option. If you're not careful you could crash the server or possibly severely corrupt your database.
How to make sure a stack trace is useful
A stack trace isn't always trustworthy or useful. This is particularly the case on Windows, where programs generally contain no debugging information and proper stack traces can't reliably be created without it.
If you're going to post a stack trace as part of a bug report or request for help, it's really important that it contains symbolic information about the call path, like the example posted above. If, instead, it contains just module names and numeric offsets within the modules, it doesn't tell the reader anything useful unless they know exactly what versions of the modules you have. Even then, it's difficult and cumbersome to read. So, make sure your stack trace contains lines showing
modulename!SymbolicName+offset, like this:
Take this trace, for example.
ntkrnlpa.exe+0x8dafe ntkrnlpa.exe+0x29a82 ntkrnlpa.exe+0x33198 hal.dll+0x6199 hal.dll+0x63d9 hal.dll+0x6577 hal.dll+0x3902 postgres.exe!process_implied_equality+0x18d50e
The C function <code>process_implied_equality(...) from
backend/optimizer/plan/initsplan.c is deep in the query planner, and is (according to the comments) "currently used only when an EquivalenceClass is found to contain pseudoconstants". It's not going to be calling into
hal.dll - not, at least, without a fairly long call chain between it and
hal.dll. So we can't trust that
postgres.exe was even in the
process_implied_equality function when it called into hal.dll and if it was we don't have any idea how it got there. The call trace is basically useless.
Compare it to the example stack trace shown earlier in the article, where we can clearly see that PostgreSQL has called
WaitForMultipleObjectsEx (to wait until data is delivered to a socket) via its win32 wrapper layer from the pg_getbyte function ... and so on.
... kernel32.dll!WaitForMultipleObjectsEx+0x12c postgres.exe!pgwin32_waitforsinglesocket+0x1f0 postgres.exe!pgwin32_recv+0x90 postgres.exe!secure_read+0x17d postgres.exe!pq_recvbuf+0x71 postgres.exe!pq_getbyte+0x15 postgres.exe!SocketBackend+0x6 postgres.exe!PostgresMain+0xbe8 ...
No symbolic names? Why?
Your stack traces aren't showing !SymbolicNames, and you don't know why. Here are a few possibilities.
No symbols for the module
You might simply not have any way to get symbols for the module you're debugging. If you're finding that most of the stack trace (at least the parts in the system DLLs and NT kernel) have symbolic names shown, but other parts don't, it's quite likely you just have no debugging information (pdb files) for those parts.
This shouldn't be the case for PostgreSQL if you've set up the symbol path to point to the symbols for your version of PostgreSQL as well as the Microsoft Symbol Server, though.
_NT_SYMBOL_PATH not set
Are you sure you set the _NT_SYMBOL_PATH environment variable?
In the Start menu choose Run, then enter "cmd" and press enter.
In the resulting window, type in exactly:
set | findstr symbol
You should get the following result, allowing for differences in PostgreSQL versions, drive letters, etc:
_NT_SYMBOL_PATH is in capitals, with underscores between words and a leading underscore.
The same string should show up in Process Explorer, in Configure Symbols under the Options menu.
Add-in software firewall
Do you have an add-in software firewall installed? The firewall built in to Windows will not cause problems as it only affects listening sockets and incoming connections, but 3rd party firewalls often block outbound connections too. This may prevent access to the symbol server from working.
Make sure your software firewall allows access to
msdl.microsoft.com on port 80 (http), and/or add exemptions for
Initially written by Craig Ringer --Ringerc 10:06, 29 July 2009 (UTC)
Sections on remote debugging, software firewall, troubleshooting bad stack traces, identifying a useful stack trace, and debugging with Visual Studio added --Ringerc 06:25, 22 August 2009 (UTC)