One way to construct a test workload for purposes such as load testing and benchmarking is to record statements executed against a source server, then play them back against the test system. There are a few tools available for this purpose compatible with PostgreSQL.
Project home page: http://tsung.erlang-projects.org/
Tsung implements a multi-threaded model that tracks which transactions each query belonged to and runs them with the same concurrency as the original.
Tsung also comes with a recorder which is a PostgreSQL proxy. You can connect it to your server, connect your client to it, and let it record a session at a time. Then in the configuration you get to choose how many of each sessions you want to mix.
Note that Tsung is not really designed to replay logs. It's designed for artificial workload generation, so it's very limited in the number of sessions it can replay ... not more than 200 or so. Depending on your real workload, this may be an insurmountable obstacle.
Tsung tutorial on load-testing Postgresql servers:
- Load-testing a Postgresql server with Tsung
- pgTsung: app-specific testing methodology
- pgTsung: stage 1
- pgTsung: stage 2
Project home page: http://github.com/myYearbook/playr
Designed to identify potential issues resulting from software or hardware upgrades on myYearbook.com's high-volume databases, Playr is able test an entire application and provide the administrator with a report detailing the performance and error-related divergence of a Postgres configuration from the production workload. This sounds good, but how does it work? Playr works by capturing your production database workload, analyzing it, and processing it into a set of binary capture files. Once the workload capture and processing has been completed, replay can be performed using one or more driver systems. By using the binary capture files, Playr is able to recreate the production workload exactly; all concurrency, timing, and commit ordering remain identical to production.
Playr is not designed to work with lesser hardware than the machine where the log files were originally taken from. It is not designed to be a benchmark application in the traditional sense. If the new hardware can not keep up with the timing, Playr will give up its stress test and let you know that it fell behind.
Project home page: http://pgreplay.projects.postgresql.org/
pgreplay reads a PostgreSQL log file (*not* a WAL file), extracts the SQL statements and executes them in the same order and relative time against a PostgreSQL database cluster. This makes it straightforward to capture the data needed to drive it from a production server. pgreplay is single-threaded, but uses asynchronous query processing, so multiple connections can be handled simultaneously. pgreplay will use as many connections as the original run did, and query order and timing are retained. If the execution of statements gets behind schedule, warning messages are issued that indicate that the server cannot handle the load in a timely fashion. The idea is to replay a real-world database workload as exactly as possible.
- Should compile and run on any platform that PostgreSQL supports
- Can replay the workload at different speeds
- Can parse "stderr" and "csvlog" log files
- Can save workload to replay in "replay file" for reuse
- Statements that are not logged, such as COPY, will not be played back.
Project home page: https://github.com/gocardless/pgreplay-go/
This tool is a different take on the existing pgreplay project. Where pgreplay will playback Postgres log files while respecting relative chronological order, pgreplay-go plays back statements at approximately the same rate they were originally sent to the database.
- Provides Prometheus metrics