Unwarm

From PostgreSQL wiki
Jump to navigationJump to search

Projects that would be useful for testing caching.

1. A way to kick data out of PostgreSQL's buffer pool. You can always just stop it and start it again to kick everything out of the buffer pool, but we want something more fine grained.

1A. One way to do it. What about a new function pg_buffercache_invalidate(bufferid) in the pg_buffercache extension? Then you could control which buffers are invalidated by using a query against the pg_buffercache view. SELECT pg_buffercache_invalidate(bufferid) FROM pg_buffercache WHERE relfilenode = 1234, or you can be more or less specific. Perhaps it could return true or false to indicate if it succeeded. It can't invalidate buffers that are pinned, so perhaps it should just return false for those? In an initial version, it could also decline to invalidate dirty buffers, if that is simpler.

1B. Another way to do it. What about a function pg_prewarm(..., 'unbuffer', ...)? That would be the opposite of 'buffer'.

I am not sure which approach is best. Once you have the right sort of function exported from bufmgr.c it would be trivial to expose it to *both* of those extensions. (?)

2. A way to see which buffers are currently in the OS page cache. This could use mincore() on Linux (and other Unixes; mincore() is an old BSD/Solaris interface present on probably all Unixes), and VirtualQuery() on Windows. This could be a function that takes a relation name and returns data in a set-returning-function format. Here is a similar piece of code. Note that the pgfincore extension already exists (outside the tree), pgfincore does allow per page details and option to load/unload cache for each, this is a critical use case for demanding HA and testing (and was a motivation to for pgfincore...), with a BSD kernel or a patched linux, pgfincore also allows to check dirty state. See example below.

3. A way to ask the operating system to stop caching a relation. This could be done with posix_fadvise(POSIX_FADV_WONTNEED). It could be part of the pg_prewarm(..., 'wontneed', ...). That would involve figuring out how the pg_prewarm extension should access the files, which may be a difficult decision (= mailing list discussion). Two obvious ideas would be: (1) pg_prewarm directly opens the relation files and calls posix_fadvise(). (2) pg_prewarm calls new smgr.c routines smgrwontneed(), because really smgr is the module that is supposed to be in charge of files.


pgfincore example to achieve the warm/unwarm, each bit is a PostgreSQL page, so here we warm cache with the first 3 and remove the last 3 from cache (system cache, not shared buffers).

-- Loading and Unloading
cedric=# select * from pgfadvise_loader('pgbench_accounts', 0, true, true, B'111000');
     relpath      | os_page_size | os_pages_free | pages_loaded | pages_unloaded 
------------------+--------------+---------------+--------------+----------------
 base/11874/16447 |         4096 |        408376 |            3 |              3

(Pgfincore roadmap is to add cachestat() new syscall (linux 6.5) for cache inspection, and propose all of that to PostgreSQL tree if there are interest.)



Here is some previous discussion of these topics: