Valgrind
Valgrind and Postgres
Postgres supports Valgrind memcheck directly: "client requests" can be included in the memory allocator, which allows detection of many additional memory errors that would otherwise go undetected in vanilla builds. See src/include/pg_config_manual.h for full details of how to build Postgres with support for Valgrind memcheck instrumentation.
You should normally use MEMORY_CONTEXT_CHECKING with USE_VALGRIND; instrumentation of repalloc() is inferior without it.
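As a minimal sketch of such a build, the two macros named above can be defined on the configure command line. Passing them through CPPFLAGS is an assumption made here for illustration (pg_config_manual.h remains the authoritative description), and the Valgrind development headers are assumed to be installed.

```shell
# Macros named in the text above; defining them via CPPFLAGS rather than by
# editing pg_config_manual.h is an assumption of this sketch.
PG_VALGRIND_CPPFLAGS="-DUSE_VALGRIND -DMEMORY_CONTEXT_CHECKING"

# Run from the top of a Postgres source tree.
if [ -x ./configure ]; then
    # --enable-debug keeps debug symbols so reports carry file and line detail.
    ./configure --enable-debug CPPFLAGS="$PG_VALGRIND_CPPFLAGS"
else
    echo "no ./configure here; change to the Postgres source directory first" >&2
fi
```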
Known Bugs
If you observe core dumps in autovacuum while running under Valgrind on x86_64 hardware, it's probably a known bug in Valgrind 3.8.1 and earlier; see https://bugs.kde.org/show_bug.cgi?id=280114. If you're prepared to recompile Valgrind, apply the one-line patch shown there. Otherwise, the simplest answer is to set autovacuum = off in postgresql.conf while using Valgrind. However, it's not clear that this workaround will hide all instances of the issue.
Using Valgrind
One effective approach to gathering Valgrind memcheck instrumentation while running the regression tests is outlined here. A binary built with debug symbols lets Valgrind report source file and line number detail.
A Memcheck-hosted postgres can be started like this:
valgrind --leak-check=no --gen-suppressions=all \
    --suppressions=src/tools/valgrind.supp --time-stamp=yes \
    --error-markers=VALGRINDERROR-BEGIN,VALGRINDERROR-END \
    --log-file=$HOME/pg-valgrind/%p.log --trace-children=yes \
    postgres --log_line_prefix="%m %p " \
    --log_statement=all --shared_buffers=64MB 2>&1 | tee $HOME/pg-valgrind/postmaster.log
Run the regression tests:
make installcheck
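The valgrind command above writes one log per process into $HOME/pg-valgrind. That directory should exist before the server is started (this sketch assumes Valgrind does not create it for you). Once the tests finish, the error markers make it easy to find the logs worth reading. The following is a rough sketch only; the LOGDIR variable and the report_valgrind_errors helper are hypothetical names introduced here.

```shell
# Directory used by --log-file and tee in the valgrind command; create it
# before starting the server. LOGDIR is a name made up for this sketch.
LOGDIR="${LOGDIR:-$HOME/pg-valgrind}"
mkdir -p "$LOGDIR"

# Print the names of per-process logs that contain at least one error report,
# identified by the marker passed via --error-markers.
report_valgrind_errors() {
    grep -l "VALGRINDERROR-BEGIN" "$1"/*.log 2>/dev/null
}

# After "make installcheck" has finished:
report_valgrind_errors "$LOGDIR" || echo "no Valgrind errors recorded in $LOGDIR"
```

Each file name contains the process ID (from %p), which can be matched against the %p field in the server log lines to find the statement that was running.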
If the run reports an error and more detail is needed for debugging, rerun a smaller test case with the "--track-origins=yes --read-var-info=yes" flags added to the valgrind command. That slows things down noticeably but produces more specific diagnostics. For more information, see the Valgrind documentation.
To limit the duration of testing, run an individual subset of the regression tests:
make installcheck-tests TESTS="json combocid"
Not all of the regression tests can be run on their own like this. You may wish to verify first that each test passes without Postgres running under Valgrind. (The file src/test/regress/parallel_schedule should give you some idea of the dependencies that the test of interest may have on other tests.)
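One way to read the schedule file is to print every test group up to and including the one that contains the test of interest. The sketch below assumes the schedule consists of lines beginning with "test:" followed by test names; tests_before is a hypothetical helper, and the earlier groups it prints are only candidates for dependencies, not a confirmed list.

```shell
# Print schedule lines up to and including the group that runs the given test.
# $1 = test name, $2 = schedule file
tests_before() {
    awk -v t="$1" '
        /^test:/ {
            print
            # Stop once the group containing the requested test has been printed.
            for (i = 2; i <= NF; i++) if ($i == t) exit
        }' "$2"
}

SCHEDULE=src/test/regress/parallel_schedule
if [ -f "$SCHEDULE" ]; then
    tests_before json "$SCHEDULE"
fi
```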
Additional suppressions
You may need additional local suppressions. If you get complaints about wcstombs and related functions, consider this addition:
# wcsrtombs uses some clever optimizations internally, which to valgrind
# may look like access to uninitialized data. For example AVX2 instructions
# load data in 256-bit chunks, irrespective of wchar length. gconv does
# something similar by loading data in 32bit chunks and then shifting the
# data internally. Neither of those actually uses the uninitialized part
# of the buffer, as far as we know.
#
# https://www.postgresql.org/message-id/90ac0452-e907-e7a4-b3c8-15bd33780e62@2ndquadrant.com
{
   wcsnlen_optimized
   Memcheck:Cond
   fun:__wcsnlen_avx2
   fun:wcsrtombs
   fun:wcstombs
   fun:wchar2char
}
{
   wcsnlen_optimized_addr32
   Memcheck:Addr32
   fun:__wcsnlen_avx2
   fun:wcsrtombs
   fun:wcstombs
   fun:wchar2char
}
{
   gconv_transform_internal
   Memcheck:Cond
   fun:__gconv_transform_internal_utf8
   fun:wcsrtombs
   fun:wcstombs
   fun:wchar2char
}
(On my system, this is not enough, because wchar2char seems to be inlined, so I get additional reports about wcstombs being called by the functions that call wchar2char. You might therefore need to remove the last line, fun:wchar2char, from each proposed suppression.)
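Local suppressions do not have to be merged into src/tools/valgrind.supp, since Valgrind accepts the --suppressions option more than once. The sketch below builds the option list and adds a local file only if it exists. The supp_args helper and the local.supp path used in the note are hypothetical.

```shell
# Print the --suppressions options for a run: the in-tree file, plus a local
# file when one is present. $1 = path to the local suppressions file.
supp_args() {
    printf '%s' "--suppressions=src/tools/valgrind.supp"
    if [ -f "$1" ]; then
        printf ' %s' "--suppressions=$1"
    fi
    printf '\n'
}
```

The output of supp_args "$HOME/pg-valgrind/local.supp" can then take the place of the single --suppressions option in the valgrind command. Because the result is split on whitespace when substituted unquoted, this only works for paths without spaces.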
Troubleshooting
If you get an error like
valgrind: mmap(0x58000000, 2347008) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
the most likely culprit is an attempt to run valgrind under valgrind, likely via a wrapper script. If you're using a wrapper script for pg_regress, for example, make sure you use which -a or change the PATH to find the next postgres, rather than re-executing your wrapper script.
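A wrapper can avoid re-executing itself by asking which -a for every match in PATH and skipping its own file. The following is a sketch under stated assumptions: next_in_path is a hypothetical helper, and readlink -f is assumed to be available to resolve symlinks.

```shell
# Print the first program named $1 in PATH that is not the wrapper itself.
# $1 = program name, $2 = path of the wrapper script to skip
next_in_path() {
    self=$(readlink -f "$2")
    which -a "$1" | while IFS= read -r candidate; do
        # Compare resolved paths so a symlink to the wrapper is skipped too.
        if [ "$(readlink -f "$candidate")" != "$self" ]; then
            printf '%s\n' "$candidate"
            break
        fi
    done
}
```

Inside a wrapper installed as postgres, real=$(next_in_path postgres "$0") yields the binary to pass to valgrind. An empty result means no other postgres was found and the wrapper should stop rather than call itself.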