https://wiki.postgresql.org/api.php?action=feedcontributions&user=C2main&feedformat=atom
PostgreSQL wiki - User contributions [en] - 2024-03-29T05:52:30Z - MediaWiki 1.35.13

https://wiki.postgresql.org/index.php?title=Talk:Operations_cheat_sheet&diff=38568
Talk:Operations cheat sheet - 2024-01-18T06:25:23Z
<p>C2main: Created page</p>
<hr />
<div>This is very nice work, and at the same time the content itself is version specific. The hard thing is to keep the content aligned with the state of PostgreSQL when the page is read, or to indicate up to which version a given statement is "true".<br />
<br />
An example:<br />
<br />
<pre><br />
Streaming replication<br />
<br />
Architecture<br />
<br />
Topology<br />
Only the entire database cluster is replicated. Partial replication is not possible.<br />
</pre><br />
<br />
It's false: logical replication allows precisely that. (I know you know, but a new reader will be confused, precisely when such a cheat sheet is probably most valuable to her.)<br />
<br />
Maybe adding a date or version reference would be enough.</div>

https://wiki.postgresql.org/index.php?title=Readahead&diff=38567
Readahead - 2024-01-17T16:14:00Z
<p>C2main: review</p>
<hr />
<div><br />
= in Linux =<br />
<br />
== Summary ==<br />
<br />
; Very concise summary<br />
: readahead does read.<br />
<br />
;Concise summary<br />
: readahead in Linux is more similar to the readv implementation in PostgreSQL (at least the streaming I/O implemented by Thomas Munro for pg17).<br />
: It might be easier to think of it that way to understand how it works.<br />
<br />
;Important points<br />
: readahead is not async<br />
: readahead soon becomes filesystem specific; see the bottom of the call tree below.<br />
<br />
On Linux, readahead has evolved over time; the last known review was done by Neil Brown around April 8, 2022.<br/><br />
He wrote an article to summarize what he discovered: [https://lwn.net/Articles/888715/ Readahead: the documentation I wanted to read].<br />
<br />
This led to updated documentation (apparently merged in 5.18): https://www.kernel.org/doc/html/latest/core-api/mm-api.html#readahead<br />
<br />
This is indeed a very nice summary, and it mostly covers two entry functions for readahead: ''page_cache_sync_ra()'' and ''page_cache_async_ra()''.<br />
<br />
Both have nice, up-to-date documentation about their underlying functions: https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=page_cache_async_readahead#c.page_cache_sync_readahead<br />
<br />
----<br />
<br />
Take a moment to read the documentation (excerpt here):<br />
<br />
; page_cache_sync_readahead()<br />
: should be called when a cache miss happened: '''it will submit the read'''. The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.<br />
<br />
; page_cache_async_readahead()<br />
: should be called when a page is used which is marked as PageReadahead; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.<br />
<br />
OK, so far so good.<br />
<br />
But the code path for ''posix_fadvise'' is distinct, and its entry point leads to:<br />
<br />
; page_cache_ra_unbounded()<br />
: This function is for filesystems to call when they want to start readahead beyond a file's stated ''i_size''. This is almost certainly not the function you want to call. Use ''page_cache_async_readahead()'' or ''page_cache_sync_readahead()'' instead. File is referenced by caller. Mutexes may be held by caller. '''May sleep, but will not reenter filesystem to reclaim memory.'''<br />
<br />
Today, Linux uses "folios", and the readahead flag is set via ''folio_set_readahead(folio)'' in this function.<br />
<br />
The other super important part is "may sleep": yes, IT IS NOT ASYNCHRONOUS. The only situation where it is partly asynchronous is when the storage is really congested, in which case the call is aborted partway through the provided readahead range. As mentioned by Neil, the "congestion" signal has not been honored everywhere and may not work as expected.<br />
<br />
Side note: Linux splits the range into chunks of 2 MB to manage memory and reduce locking. Hardcoded.<br />
<br />
An excerpt from the comment in the code of [https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L27 readahead]: <br />
<br />
/*<br />
* Each readahead request is partly synchronous read, and partly async<br />
* readahead. This is reflected in the struct file_ra_state which<br />
* contains ->size being the total number of pages, and ->async_size<br />
* which is the number of pages in the async section. The readahead<br />
* flag will be set on the first folio in this async section to trigger<br />
* a subsequent readahead. Once a series of sequential reads has been<br />
* established, there should be no need for a synchronous component and<br />
* all readahead request will be fully asynchronous.<br />
*/<br />
<br />
== Call tree - The functions really used by linux posix_fadvise ==<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L31<br />
<br />
/*<br />
* POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could<br />
* deactivate the pages and clear PG_Referenced.<br />
*/<br />
int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/internal.h#L126<br />
<br />
inline wrapper<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L306<br />
<br />
/*<br />
* Chunk the readahead into 2 megabyte units, so that we don't pin too much<br />
* memory at once.<br />
*/<br />
void force_page_cache_ra(struct readahead_control *ractl,<br />
unsigned long nr_to_read)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L281<br />
<br />
/*<br />
* do_page_cache_ra() actually reads a chunk of disk. It allocates<br />
* the pages first, then submits them for I/O. This avoids the very bad<br />
* behaviour which would occur if page allocations are causing VM writeback.<br />
* We really don't want to intermingle reads and writes like that.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L205<br />
<br />
/**<br />
* page_cache_ra_unbounded - Start unchecked readahead.<br />
* @ractl: Readahead control.<br />
* @nr_to_read: The number of pages to read.<br />
* @lookahead_size: Where to start the next readahead.<br />
*<br />
* This function is for filesystems to call when they want to start<br />
* readahead beyond a file's stated i_size. This is almost certainly<br />
* not the function you want to call. Use page_cache_async_readahead()<br />
* or page_cache_sync_readahead() instead.<br />
*<br />
* Context: File is referenced by caller. Mutexes may be held by caller.<br />
* May sleep, but will not reenter filesystem to reclaim memory.<br />
*/ <br />
<br />
some interesting comments in it, during preallocation:<br />
<br />
/*<br />
* Partway through the readahead operation, we will have added<br />
* locked pages to the page cache, but will not yet have submitted<br />
* them for I/O. Adding another page may need to allocate memory,<br />
* which can trigger memory reclaim. Telling the VM we're in<br />
* the middle of a filesystem operation will cause it to not<br />
* touch file-backed pages, preventing a deadlock. Most (all?)<br />
* filesystems already specify __GFP_NOFS in their mapping's<br />
* gfp_mask, but let's be explicit here.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L146<br />
<br />
static void read_pages(struct readahead_control *rac)<br />
<br />
Then it is filesystem specific.<br />
<br />
=== EXT4 ===<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L3124<br />
<br />
static int ext4_read_folio(struct file *file, struct folio *folio)<br />
<br />
if not found in memory:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/readpage.c#L211<br />
<br />
int ext4_mpage_readpages(struct inode *inode,<br />
struct readahead_control *rac, struct folio *folio)<br />
<br />
after a lot of logic around sequential access, holes, ...:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L833<br />
<br />
<br />
/**<br />
* submit_bio - submit a bio to the block device layer for I/O<br />
* @bio: The &struct bio which describes the I/O<br />
*<br />
* submit_bio() is used to submit I/O requests to block devices. It is passed a<br />
* fully set up &struct bio that describes the I/O that needs to be done. The<br />
* bio will be send to the device described by the bi_bdev field.<br />
*<br />
* The success/failure status of the request, along with notification of<br />
* completion, is delivered asynchronously through the ->bi_end_io() callback<br />
* in @bio. The bio must NOT be touched by the caller until ->bi_end_io() has<br />
* been called.<br />
*/<br />
<br />
== First thoughts ==<br />
<br />
* issuing WILLNEED on a sequential pattern just competes with Linux's own readahead.<br />
<br />
* if PG_readahead is set, Linux will interpret that as a successful past readahead and may keep doing more readahead from this block when reading (apparently mitigated by some hole detection).<br />
* if PG_readahead is not set when reading, it has not been checked how the readahead logic will use that.<br />
<br />
* issuing DONTNEED just after a read is apparently optimized in the Linux code.<br />
<br />
* setting the RANDOM or SEQUENTIAL flag effectively influences the Linux default readahead. And it is costless.<br />
<br />
<br />
[[Category:development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Readahead&diff=38566Readahead2024-01-17T16:11:06Z<p>C2main: smarter very concise summary</p>
<hr />
<div><br />
= in Linux =<br />
<br />
== Summary ==<br />
<br />
; Very concise summary<br />
: readahead does read.<br />
<br />
;Concise summary<br />
: readahead in linux is more similar to readv implementation in PostgreSQL (at least the stream IO implemented by Thomas Munro for pg17).<br />
: It might be easier to read it this way to understand how it works.<br />
<br />
;Important points<br />
: readahead is not async<br />
: readahead is soon filesystem specific, see bottom of call tree below.<br />
<br />
On linux, readahead as evolved over time and the last known review has been done by Neil Brown around April 8, 2022.<br/><br />
He wrote an article to summarize what he discovered: [https://lwn.net/Articles/888715/ Readahead: the documentation I wanted to read].<br />
<br />
Which lead to this updated doc (apparently merge in 5.18): https://www.kernel.org/doc/html/latest/core-api/mm-api.html#readahead<br />
<br />
This is indeed a very nice summary and it covers mostly 2 entry functions for readahead: ''page_cache_sync_ra()'' and ''page_cache_async_ra()''.<br />
<br />
Both have a nice up to date documentation about their underlying functions: https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=page_cache_async_readahead#c.page_cache_sync_readahead<br />
<br />
----<br />
<br />
Take a moment to read the documentation (excerpt here):<br />
<br />
; page_cache_sync_readahead()<br />
: should be called when a cache miss happened: '''it will submit the read'''. The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.<br />
<br />
; page_cache_async_readahead()<br />
: should be called when a page is used which is marked as PageReadahead; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.</q><br />
<br />
OK, so far so good.<br />
<br />
But the code when doing ''posix_fadvise'' is distinct, and the entry point is:<br />
<br />
; page_cache_ra_unbounded()<br />
: This function is for filesystems to call when they want to start readahead beyond a file's stated ''i_size''. This is almost certainly not the function you want to call. Use ''page_cache_async_readahead()'' or ''page_cache_sync_readahead()'' instead. File is referenced by caller. Mutexes may be held by caller. '''May sleep, but will not reenter filesystem to reclaim memory.'''<br />
<br />
<br />
A part of the comment is explained in the code: this function apparently does not mark pages as ''PG_readahead'', but the others two before do. This mark on page in linux VM is a hint: if the page is read and was marked as ''PG_readahead'', then the logic is good and linux can keep on doing prefetching. Today, linux is using "folios", and the readahead flag is set via ''folio_set_readahead(folio);'' in [https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L259 page_cache_ra_unbounded()].<br />
<br />
Other super important part is: "can wait", yes IT IS NOT ASYNCHRONOUS. The only situation where it's partly asynchronous is when the storage is really congested and so during the execution of the call on the provided readahead range, the process is aborted. As mentioned by Neil the "congestion" has not been followed everywhere and may not work as expected.<br />
<br />
Side note: linux split range in chunks of 2MB to manage memory and reduce locking. Hardcoded.<br />
<br />
An excerpt from the comment in the code of [https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L27 readahead]: <br />
<br />
/*<br />
* Each readahead request is partly synchronous read, and partly async<br />
* readahead. This is reflected in the struct file_ra_state which<br />
* contains ->size being the total number of pages, and ->async_size<br />
* which is the number of pages in the async section. The readahead<br />
* flag will be set on the first folio in this async section to trigger<br />
* a subsequent readahead. Once a series of sequential reads has been<br />
* established, there should be no need for a synchronous component and<br />
* all readahead request will be fully asynchronous.<br />
*/<br />
<br />
== Call tree - The functions really used by linux posix_fadvise ==<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L31<br />
<br />
/*<br />
* POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could<br />
* deactivate the pages and clear PG_Referenced.<br />
*/<br />
int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/internal.h#L126<br />
<br />
inline wrapper<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L306<br />
<br />
/*<br />
* Chunk the readahead into 2 megabyte units, so that we don't pin too much<br />
* memory at once.<br />
*/<br />
void force_page_cache_ra(struct readahead_control *ractl,<br />
unsigned long nr_to_read)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L281<br />
<br />
/*<br />
* do_page_cache_ra() actually reads a chunk of disk. It allocates<br />
* the pages first, then submits them for I/O. This avoids the very bad<br />
* behaviour which would occur if page allocations are causing VM writeback.<br />
* We really don't want to intermingle reads and writes like that.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L205<br />
<br />
/**<br />
* page_cache_ra_unbounded - Start unchecked readahead.<br />
* @ractl: Readahead control.<br />
* @nr_to_read: The number of pages to read.<br />
* @lookahead_size: Where to start the next readahead.<br />
*<br />
* This function is for filesystems to call when they want to start<br />
* readahead beyond a file's stated i_size. This is almost certainly<br />
* not the function you want to call. Use page_cache_async_readahead()<br />
* or page_cache_sync_readahead() instead.<br />
*<br />
* Context: File is referenced by caller. Mutexes may be held by caller.<br />
* May sleep, but will not reenter filesystem to reclaim memory.<br />
*/ <br />
<br />
some interesting comments in it, during preallocation:<br />
<br />
/*<br />
* Partway through the readahead operation, we will have added<br />
* locked pages to the page cache, but will not yet have submitted<br />
* them for I/O. Adding another page may need to allocate memory,<br />
* which can trigger memory reclaim. Telling the VM we're in<br />
* the middle of a filesystem operation will cause it to not<br />
* touch file-backed pages, preventing a deadlock. Most (all?)<br />
* filesystems already specify __GFP_NOFS in their mapping's<br />
* gfp_mask, but let's be explicit here.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L146<br />
<br />
static void read_pages(struct readahead_control *rac)<br />
<br />
Then it is filesystem specific.<br />
<br />
=== EXT4 ===<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L3124<br />
<br />
static int ext4_read_folio(struct file *file, struct folio *folio)<br />
<br />
if not found in mem:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/readpage.c#L211<br />
<br />
int ext4_mpage_readpages(struct inode *inode,<br />
struct readahead_control *rac, struct folio *folio)<br />
<br />
after lot of logic around sequential, hole, ...:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L833<br />
<br />
<br />
/**<br />
* submit_bio - submit a bio to the block device layer for I/O<br />
* @bio: The &struct bio which describes the I/O<br />
*<br />
* submit_bio() is used to submit I/O requests to block devices. It is passed a<br />
* fully set up &struct bio that describes the I/O that needs to be done. The<br />
* bio will be send to the device described by the bi_bdev field.<br />
*<br />
* The success/failure status of the request, along with notification of<br />
* completion, is delivered asynchronously through the ->bi_end_io() callback<br />
* in @bio. The bio must NOT be touched by the caller until ->bi_end_io() has<br />
* been called.<br />
*/<br />
<br />
== First thoughts ==<br />
<br />
* having a WILLNEED on a sequential pattern just competes with linux own ra.<br />
<br />
* if PG_readahead is set, linux will interpret that as a successful past readahead and may keep on doing more ra from this block when reading (it's mitigated by some hole detection apparently).<br />
* if PG_readahead is not set when reading, not checked how the ra logic will use that.<br />
<br />
* having a DONTNEED just after a read is apparently optimized in linux code.<br />
<br />
* setting RANDOM or SEQUENTIAL flag influence the linux default ra effectively. And is costless.<br />
<br />
<br />
[[Category:development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Readahead&diff=38565Readahead2024-01-17T16:10:31Z<p>C2main: adjust some content and remove the false idea that doing posix_fadvise might not set the PG_readahead flag.</p>
<hr />
<div><br />
= in Linux =<br />
<br />
== Summary ==<br />
<br />
; Very concise summary<br />
: readahead is a buzzword.<br />
<br />
;Concise summary<br />
: readahead in linux is more similar to readv implementation in PostgreSQL (at least the stream IO implemented by Thomas Munro for pg17).<br />
: It might be easier to read it this way to understand how it works.<br />
<br />
;Important points<br />
: readahead is not async<br />
: readahead is soon filesystem specific, see bottom of call tree below.<br />
<br />
On linux, readahead as evolved over time and the last known review has been done by Neil Brown around April 8, 2022.<br/><br />
He wrote an article to summarize what he discovered: [https://lwn.net/Articles/888715/ Readahead: the documentation I wanted to read].<br />
<br />
Which lead to this updated doc (apparently merge in 5.18): https://www.kernel.org/doc/html/latest/core-api/mm-api.html#readahead<br />
<br />
This is indeed a very nice summary and it covers mostly 2 entry functions for readahead: ''page_cache_sync_ra()'' and ''page_cache_async_ra()''.<br />
<br />
Both have a nice up to date documentation about their underlying functions: https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=page_cache_async_readahead#c.page_cache_sync_readahead<br />
<br />
----<br />
<br />
Take a moment to read the documentation (excerpt here):<br />
<br />
; page_cache_sync_readahead()<br />
: should be called when a cache miss happened: '''it will submit the read'''. The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.<br />
<br />
; page_cache_async_readahead()<br />
: should be called when a page is used which is marked as PageReadahead; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.</q><br />
<br />
OK, so far so good.<br />
<br />
But the code when doing ''posix_fadvise'' is distinct, and the entry point is:<br />
<br />
; page_cache_ra_unbounded()<br />
: This function is for filesystems to call when they want to start readahead beyond a file's stated ''i_size''. This is almost certainly not the function you want to call. Use ''page_cache_async_readahead()'' or ''page_cache_sync_readahead()'' instead. File is referenced by caller. Mutexes may be held by caller. '''May sleep, but will not reenter filesystem to reclaim memory.'''<br />
<br />
<br />
A part of the comment is explained in the code: this function apparently does not mark pages as ''PG_readahead'', but the others two before do. This mark on page in linux VM is a hint: if the page is read and was marked as ''PG_readahead'', then the logic is good and linux can keep on doing prefetching. Today, linux is using "folios", and the readahead flag is set via ''folio_set_readahead(folio);'' in [https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L259 page_cache_ra_unbounded()].<br />
<br />
Other super important part is: "can wait", yes IT IS NOT ASYNCHRONOUS. The only situation where it's partly asynchronous is when the storage is really congested and so during the execution of the call on the provided readahead range, the process is aborted. As mentioned by Neil the "congestion" has not been followed everywhere and may not work as expected.<br />
<br />
Side note: linux split range in chunks of 2MB to manage memory and reduce locking. Hardcoded.<br />
<br />
An excerpt from the comment in the code of [https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L27 readahead]: <br />
<br />
/*<br />
* Each readahead request is partly synchronous read, and partly async<br />
* readahead. This is reflected in the struct file_ra_state which<br />
* contains ->size being the total number of pages, and ->async_size<br />
* which is the number of pages in the async section. The readahead<br />
* flag will be set on the first folio in this async section to trigger<br />
* a subsequent readahead. Once a series of sequential reads has been<br />
* established, there should be no need for a synchronous component and<br />
* all readahead request will be fully asynchronous.<br />
*/<br />
<br />
== Call tree - The functions really used by linux posix_fadvise ==<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L31<br />
<br />
/*<br />
* POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could<br />
* deactivate the pages and clear PG_Referenced.<br />
*/<br />
int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/internal.h#L126<br />
<br />
inline wrapper<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L306<br />
<br />
/*<br />
* Chunk the readahead into 2 megabyte units, so that we don't pin too much<br />
* memory at once.<br />
*/<br />
void force_page_cache_ra(struct readahead_control *ractl,<br />
unsigned long nr_to_read)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L281<br />
<br />
/*<br />
* do_page_cache_ra() actually reads a chunk of disk. It allocates<br />
* the pages first, then submits them for I/O. This avoids the very bad<br />
* behaviour which would occur if page allocations are causing VM writeback.<br />
* We really don't want to intermingle reads and writes like that.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L205<br />
<br />
/**<br />
* page_cache_ra_unbounded - Start unchecked readahead.<br />
* @ractl: Readahead control.<br />
* @nr_to_read: The number of pages to read.<br />
* @lookahead_size: Where to start the next readahead.<br />
*<br />
* This function is for filesystems to call when they want to start<br />
* readahead beyond a file's stated i_size. This is almost certainly<br />
* not the function you want to call. Use page_cache_async_readahead()<br />
* or page_cache_sync_readahead() instead.<br />
*<br />
* Context: File is referenced by caller. Mutexes may be held by caller.<br />
* May sleep, but will not reenter filesystem to reclaim memory.<br />
*/ <br />
<br />
some interesting comments in it, during preallocation:<br />
<br />
/*<br />
* Partway through the readahead operation, we will have added<br />
* locked pages to the page cache, but will not yet have submitted<br />
* them for I/O. Adding another page may need to allocate memory,<br />
* which can trigger memory reclaim. Telling the VM we're in<br />
* the middle of a filesystem operation will cause it to not<br />
* touch file-backed pages, preventing a deadlock. Most (all?)<br />
* filesystems already specify __GFP_NOFS in their mapping's<br />
* gfp_mask, but let's be explicit here.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L146<br />
<br />
static void read_pages(struct readahead_control *rac)<br />
<br />
Then it is filesystem specific.<br />
<br />
=== EXT4 ===<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L3124<br />
<br />
static int ext4_read_folio(struct file *file, struct folio *folio)<br />
<br />
if not found in mem:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/readpage.c#L211<br />
<br />
int ext4_mpage_readpages(struct inode *inode,<br />
struct readahead_control *rac, struct folio *folio)<br />
<br />
after lot of logic around sequential, hole, ...:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L833<br />
<br />
<br />
/**<br />
* submit_bio - submit a bio to the block device layer for I/O<br />
* @bio: The &struct bio which describes the I/O<br />
*<br />
* submit_bio() is used to submit I/O requests to block devices. It is passed a<br />
* fully set up &struct bio that describes the I/O that needs to be done. The<br />
* bio will be send to the device described by the bi_bdev field.<br />
*<br />
* The success/failure status of the request, along with notification of<br />
* completion, is delivered asynchronously through the ->bi_end_io() callback<br />
* in @bio. The bio must NOT be touched by the caller until ->bi_end_io() has<br />
* been called.<br />
*/<br />
<br />
== First thoughts ==<br />
<br />
* having a WILLNEED on a sequential pattern just competes with linux own ra.<br />
<br />
* if PG_readahead is set, linux will interpret that as a successful past readahead and may keep on doing more ra from this block when reading (it's mitigated by some hole detection apparently).<br />
* if PG_readahead is not set when reading, not checked how the ra logic will use that.<br />
<br />
* having a DONTNEED just after a read is apparently optimized in linux code.<br />
<br />
* setting RANDOM or SEQUENTIAL flag influence the linux default ra effectively. And is costless.<br />
<br />
<br />
[[Category:development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Readahead&diff=38562Readahead2024-01-14T15:07:33Z<p>C2main: damned mediawiki syntax for links...</p>
<hr />
<div>WIP<br />
<br />
= in Linux =<br />
<br />
== Summary ==<br />
<br />
; Very concise summary<br />
: readahead is a buzzword.<br />
<br />
;Concise summary<br />
: readahead in linux is more similar to readv implementation in PostgreSQL (at least the stream IO implemented by Thomas Munro for pg17).<br />
: It might be easier to read it this way to understand how it works.<br />
<br />
;Important points<br />
: readahead is not async<br />
: readahead is soon filesystem specific, see bottom of call tree below.<br />
<br />
On linux, readahead as evolved over time and the last known review has been done by Neil Brown around April 8, 2022.<br/><br />
He wrote an article to summarize what he discovered: [https://lwn.net/Articles/888715/ Readahead: the documentation I wanted to read].<br />
<br />
Which lead to this updated doc (apparently merge in 5.18): https://www.kernel.org/doc/html/latest/core-api/mm-api.html#readahead<br />
<br />
This is indeed a very nice summary and it covers mostly 2 entry functions for readahead: ''page_cache_sync_ra()'' and ''page_cache_async_ra()''.<br />
<br />
Both have a nice up to date documentation about their underlying functions: https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=page_cache_async_readahead#c.page_cache_sync_readahead<br />
<br />
----<br />
<br />
Take a moment to read the documentation (excerpt here):<br />
<br />
; page_cache_sync_readahead()<br />
: should be called when a cache miss happened: '''it will submit the read'''. The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.<br />
<br />
; page_cache_async_readahead()<br />
: should be called when a page is used which is marked as PageReadahead; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.</q><br />
<br />
OK, so far so good.<br />
<br />
But the code when doing ''posix_fadvise'' is distinct, and the entry point is:<br />
<br />
; page_cache_ra_unbounded()<br />
: This function is for filesystems to call when they want to start readahead beyond a file's stated ''i_size''. This is almost certainly not the function you want to call. Use ''page_cache_async_readahead()'' or ''page_cache_sync_readahead()'' instead. File is referenced by caller. Mutexes may be held by caller. '''May sleep, but will not reenter filesystem to reclaim memory.'''<br />
<br />
<br />
A part of the comment is explained in the code: this function apparently does not mark pages as ''PG_readahead'', but the others two before do. This mark on page in linux VM is a hint: if the page is read and was marked as ''PG_readahead'', then the logic is good and linux can keep on doing prefetching.<br />
<br />
Other super important part is: "can wait", yes IT IS NOT ASYNCHRONOUS. The only situation where it's partly asynchronous is when the storage is really congested and so during the execution of the call on the provided readahead range, the process is aborted. As mentioned by Neil the "congestion" has not been followed everywhere and may not work as expected.<br />
<br />
Side note: linux split range in chunks of 2MB to manage memory and reduce locking. Hardcoded.<br />
<br />
== Call tree - The functions really used by linux posix_fadvise ==<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L31<br />
<br />
/*<br />
* POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could<br />
* deactivate the pages and clear PG_Referenced.<br />
*/<br />
int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/internal.h#L126<br />
<br />
inline wrapper<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L306<br />
<br />
/*<br />
* Chunk the readahead into 2 megabyte units, so that we don't pin too much<br />
* memory at once.<br />
*/<br />
void force_page_cache_ra(struct readahead_control *ractl,<br />
unsigned long nr_to_read)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L281<br />
<br />
/*<br />
* do_page_cache_ra() actually reads a chunk of disk. It allocates<br />
* the pages first, then submits them for I/O. This avoids the very bad<br />
* behaviour which would occur if page allocations are causing VM writeback.<br />
* We really don't want to intermingle reads and writes like that.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L205<br />
<br />
/**<br />
* page_cache_ra_unbounded - Start unchecked readahead.<br />
* @ractl: Readahead control.<br />
* @nr_to_read: The number of pages to read.<br />
* @lookahead_size: Where to start the next readahead.<br />
*<br />
* This function is for filesystems to call when they want to start<br />
* readahead beyond a file's stated i_size. This is almost certainly<br />
* not the function you want to call. Use page_cache_async_readahead()<br />
* or page_cache_sync_readahead() instead.<br />
*<br />
* Context: File is referenced by caller. Mutexes may be held by caller.<br />
* May sleep, but will not reenter filesystem to reclaim memory.<br />
*/ <br />
<br />
Some interesting comments appear in it during preallocation:<br />
<br />
/*<br />
* Partway through the readahead operation, we will have added<br />
* locked pages to the page cache, but will not yet have submitted<br />
* them for I/O. Adding another page may need to allocate memory,<br />
* which can trigger memory reclaim. Telling the VM we're in<br />
* the middle of a filesystem operation will cause it to not<br />
* touch file-backed pages, preventing a deadlock. Most (all?)<br />
* filesystems already specify __GFP_NOFS in their mapping's<br />
* gfp_mask, but let's be explicit here.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L146<br />
<br />
static void read_pages(struct readahead_control *rac)<br />
<br />
Then it is filesystem specific.<br />
<br />
=== EXT4 ===<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L3124<br />
<br />
static int ext4_read_folio(struct file *file, struct folio *folio)<br />
<br />
if not found in mem:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/readpage.c#L211<br />
<br />
int ext4_mpage_readpages(struct inode *inode,<br />
struct readahead_control *rac, struct folio *folio)<br />
<br />
after a lot of logic around sequential access, holes, ...:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L833<br />
<br />
<br />
/**<br />
* submit_bio - submit a bio to the block device layer for I/O<br />
* @bio: The &struct bio which describes the I/O<br />
*<br />
* submit_bio() is used to submit I/O requests to block devices. It is passed a<br />
* fully set up &struct bio that describes the I/O that needs to be done. The<br />
* bio will be send to the device described by the bi_bdev field.<br />
*<br />
* The success/failure status of the request, along with notification of<br />
* completion, is delivered asynchronously through the ->bi_end_io() callback<br />
* in @bio. The bio must NOT be touched by the caller until ->bi_end_io() has<br />
* been called.<br />
*/<br />
<br />
== First thoughts ==<br />
<br />
* issuing WILLNEED on a sequential pattern may just compete with linux's own readahead.<br />
* issuing WILLNEED apparently does not set PG_readahead<br />
<br />
* if PG_readahead is set, linux will interpret that as a successful sequential readahead pattern and may keep on doing more readahead from this block when reading (it's apparently mitigated by some hole detection).<br />
* if PG_readahead is not set when a page is read, it has not been checked how the readahead logic uses that.<br />
* having a DONTNEED just after a read is apparently optimized in linux code.<br />
<br />
* setting the RANDOM or SEQUENTIAL flag appears to influence the default linux readahead effectively. And it is costless.<br />
<br />
* it's possible that when pg prefetches pages, it prevents linux readahead from kicking in, assuming PG_readahead is not set for such a page when it is read.<br />
<br />
So I wonder if the improved performance is due not to prefetching but to preventing linux from running readahead, since the block visited is already in cache but not marked PG_readahead. <br/><br />
Does it work by helping to defeat linux's own readahead logic?<br />
<br />
But we could get the same effect with thousands fewer syscalls: just open the file and advise the kernel that access will be RANDOM, which reduces the linux readahead logic. <br/><br />
Then flag SEQUENTIAL when doing a BULKREAD or similar...<br />
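A sketch of that cheaper scheme: declare the access pattern once per file descriptor instead of issuing one WILLNEED per block. This is an illustration, not PostgreSQL code; the helper name is made up, and the flag semantics described in the comment are the linux ones:<br />
<br />
<pre><br />
import os<br />
<br />
def open_advised(path, sequential=False):<br />
    """Open a file and declare the expected access pattern once.<br />
<br />
    On linux, POSIX_FADV_RANDOM suppresses kernel readahead for this<br />
    descriptor, while POSIX_FADV_SEQUENTIAL enlarges the readahead<br />
    window. One syscall per file, instead of one WILLNEED per block.<br />
    """<br />
    fd = os.open(path, os.O_RDONLY)<br />
    advice = (os.POSIX_FADV_SEQUENTIAL if sequential<br />
              else os.POSIX_FADV_RANDOM)<br />
    os.posix_fadvise(fd, 0, 0, advice)  # len == 0: whole file<br />
    return fd<br />
</pre><br />
<br />
A random-access workload would call open_advised(path); a BULKREAD-style scan would pass sequential=True.<br />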
<br />
[[Category:development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=AIO&diff=38561AIO2024-01-14T14:07:24Z<p>C2main: Adding a link to readahead page and mention current prefetching is not async.</p>
<hr />
<div>Effort to add asynchronous I/O and optional direct I/O to PostgreSQL.<br />
<br />
= Resources =<br />
<br />
* [https://github.com/anarazel/postgres/tree/aio Development branch]<br />
* [https://www.postgresql.org/message-id/flat/20210223100344.llw5an2aklengrmn%40alap3.anarazel.de Discussion on pgsql-hackers]<br />
<br />
= Upstreamed patches =<br />
<br />
Progress moving work from the AIO branch into PostgreSQL...<br />
<br />
== Committed ==<br />
<br />
* [https://commitfest.postgresql.org/33/3147/ O_DIRECT on macOS]<br />
** So that Mac hackers can test io_data_direct=on (worker mode or posix_aio mode)<br />
* [https://commitfest.postgresql.org/31/2885/ pg_pwritev() and pg_preadv()]<br />
** Portable support for synchronous scatter/gather I/O<br />
* [https://commitfest.postgresql.org/32/3008/ Replace buffer I/O locks with condition variables]<br />
** So that backends other than the one that drains an I/O from the kernel can wait for it to complete<br />
* [https://commitfest.postgresql.org/41/3999/ Aligned memory allocation]<br />
** As required for buffers used in direct I/O<br />
* [https://commitfest.postgresql.org/42/3996/ Direct I/O GUC]<br />
** Released in 16 as debug_io_direct, to avoid attracting too much attention<br />
* [https://commitfest.postgresql.org/34/2962/ Fix DROP TABLESPACE on Windows with ProcSignalBarrier]<br />
** This is actually an ancient bug in PostgreSQL on Windows, but it was a bit harder to hit; make check fails every time under io_method=worker because processes hang around holding cached fds.<br />
* [https://commitfest.postgresql.org/42/3993/ Refactor relation extension]<br />
** This includes work to allow a backend to have multiple buffers in BM_IO_IN_PROGRESS state.<br />
<br />
== Proposed ==<br />
<br />
* [https://commitfest.postgresql.org/44/4532/ Streaming I/O, vectored I/O]<br />
** Introduce the concept of streams of buffers, initially for more efficient synchronous I/O<br />
<br />
= TODO =<br />
<br />
== General ==<br />
<br />
* get synchronous optimisation to work on all implementations<br />
** if we know we're going to submit exactly one IO and then wait for it (see "will_wait parameter"), switch to a regular preadv()/pwritev()/fsync()/fdatasync() syscall, which may be a little more efficient<br />
** made initially for io_method=worker, but needs more work to work for the other methods<br />
* partition_prune failing in CI due to compiler bug in specific MSVC release<br />
** [https://www.postgresql.org/message-id/CAApHDvof3AEVFZTT3MArh4Tq1cAuCJb_wk8u7FV5KMzJNhU4GQ%40mail.gmail.com discussion on -hackers]<br />
** [https://github.com/cirruslabs/docker-images-windows/issues/23 ticket to update compiler on CI] <br />
** drop commit "XXX Add temporary workaround for partition_prune test on Windows"<br />
* there might be IOs that this backend retried (and thus submitted) that aren't on our issued list, so pgaio_postmaster_before_child_exit() won't wait for them, and some kernels could cancel/forget the IOs<br />
** TODO note added to pgaio_postmaster_before_child_exit()<br />
** TODO the posix_aio implementation has a .postmaster_before_child_exit callback to fix this locally (otherwise Macs fail on CI due to random retries)<br />
* how should flush_range op be implemented on non-Linux?<br />
* write documentation, per-OS information<br />
* can we cut down on the number of places where we do non-blocking drain, for the benefit of implementations where that might be a system call?<br />
* incorporate Melanie's EXPLAIN changes<br />
* replication is slow with wal_sync_method=open_datasync, because we don't call WalSndWakeupRequest(); this explains why eg the subscription tests are super slow on macOS on CI (macOS defaults to open_datasync)<br />
* we're still carrying some obsolete code for dealing with macOS F_NOCACHE, which has been upstreamed (differently)<br />
* allow the AIO interface to be used for temporary tables, to avoid / reduce code duplication (see e.g. RelationUsesLocalBuffers() path in ReadBufferAsync())<br />
* fixed<br />
** DONE: occasional assertion failure "issued_abandoned_count == 0" [https://github.com/anarazel/postgres/commit/f69d4da72f0b9904a7accfd3577692fe6c88fadf fixed, squashed]<br />
** DONE: "buffer beyond EOF" [https://github.com/anarazel/postgres/commit/e9af58f65c761e734824b81a74da2114f4ea5375 fixed, squashed]<br />
** DONE: reference leak warnings from checkpointer [https://github.com/anarazel/postgres/commit/ffa555107a0b48054cf5cf84ffb384fa37d7c3fa fixed, squashed]<br />
<br />
== Larger Issues ==<br />
<br />
* local callbacks can lead to too deep recursion - callbacks likely shouldn't be allowed to wait for IO<br />
* ownership tracking of IOs is too complicated and yet not quite good enough<br />
* Streaming read / write interfaces are "too local" to specific users. A bit more backend-global awareness would likely be a good idea<br />
* currently PostgreSQL does some prefetching with posix_fadvise(POSIX_FADV_WILLNEED), which is not asynchronous. See [[Readahead]].<br />
<br />
== io_method=worker ==<br />
<br />
* self-adjusting IO worker pool?<br />
* more work on the spurious-wakeup vs latency tradeoff<br />
* bincheck failing on CI on Windows/worker<br />
** pg_basebackup: error: could not initiate base backup: ERROR: could not stat file or directory "C:\Users\ContainerAdministrator\AppData\Local\Temp\sTWg233xJQ\tempdir\tblspc1/PG_15_202108031/12762/16388": Permission denied<br />
** [https://www.postgresql.org/message-id/flat/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com Bug in master?]<br />
* IO workers cache file descriptors for relations that have been dropped, and then could be confused if the relfilenode is recycled! that's because they don't obtain relation locks or participate in sinval<br />
** the bgwriter's approach to this problem [https://www.postgresql.org/message-id/CA%2BhUKGLs554tQFCUjv_vn7ft9Xv5LNjPoAd--3Df%2BJJKJ7A8kw%40mail.gmail.com is actually broken in master] and certainly wouldn't work here<br />
** perhaps there is a way to rely on the caller's use of sinval and locks, putting extra information that could be used for cache invalidation into the IO<br />
<br />
== io_method=io_uring ==<br />
<br />
* move method-specific stuff into io->io_method_data.io_uring<br />
* there's a couple more #ifdef USE_LIBRING bits in aio.c that could be kicked out into pgaio_impl->something()?<br />
* locking is too heavyweight <br />
<br />
== io_method=posix_aio ==<br />
<br />
* add an "interruptible" field in shmem that can be used to avoid useless wakeups while the submitter is running synchronous IOs or already draining, with some double-checked flags to avoid races?<br />
* detect presence of POSIX AIO automatically so you don't have to build --with-posix-aio<br />
** would be good to pass smoke tests on all known POSIX AIO implementations before we do that; results so far:<br />
*** Successes!<br />
**** FreeBSD<br />
**** Linux with Glibc and Musl<br />
**** illumos and Solaris<br />
**** macOS<br />
***** need to bump up sysctl limits to get decent performance<br />
**** AIX<br />
***** needs shared_memory_type=sysv<br />
***** aio_nwait() would be a better wait/reap interface than aio_suspend(), but sadly it can't wait for aio_fsync()<br />
****** possible workaround: could use aio_nwait() whenever there are no aio_fsync calls outstanding<br />
*** Failures, that will probably need to be excluded by default in early versions:<br />
**** NetBSD 9 seems to spin eating 100% CPU in lio_listio() :-(<br />
***** clue: an earlier iteration of this code was able to pass tests on NetBSD, when we were using aio_read(), aio_write() instead of lio_listio(), and when we were using SIGEV_SIGNAL instead of SIGEV_NONE<br />
***** no bug report filed yet, but a NetBSD developer advised me not to try to use this, it's not ready<br />
** if keeping it as a configure option, it should be "enable/disable", not "with/without"<br />
* fixed<br />
** [https://cirrus-ci.com/task/6024059377942528?logs=cores#L7 crash because aio_suspend() sees EINVAL], because "activate" LIO IOCB too soon, [https://github.com/anarazel/postgres/commit/b174a85c446d4190cc89847760427a031711d2a1 commit (fixup to be squashed)]<br />
** DONE: can we avoid waiting for a merge head from a later generation? yes<br />
** DONE: can we avoid using atomics in a signal handler? most uses removed but the in-flight count remains; could do something about this<br />
** DONE: the naming and coding in the baton stuff is weird, needs a rethink -> done, now called "exchange" and shared with iocp<br />
** DONE: kill the array of active IOs if not using aio_suspend (eg FreeBSD)<br />
** DONE: would it be better to have the signal handler give up after a short time so it can get back to doing something useful, and the waiter wake it again after a bit? -> seems to be better, but may need some defences against thundering herds and useless wakeups<br />
<br />
== io_method=iocp ==<br />
<br />
* Acceptance criteria for moving iocp into the main aio branch (from aio-win32):<br />
** "TRAP: FailedAssertion("pgwin32_signal_event != NULL", File: "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\port\open.c", Line: 78, PID: 3416)"<br />
*** pgwin32_open() currently hacked to comment that out because of unresolved ordering problem, read_nondefault_variables() vs pgwin32_signal_initialize()<br />
** currently pgaio_can_scatter_gather() considers only io_data_direct when deciding; but it applies also to WAL I/O (and potentially, in future, who knows, temporary files etc)<br />
*** this matters primarily for Windows because Windows has a different answer depending on use of direct I/O, but we have at least 3 different GUCs to control that on different subsystems<br />
*** one idea would be for pgaio_can_scatter_gather() to take an IO and check the "scb" to see who is asking, or something like that...<br />
*** another idea is to carry a "direct" flag on every IO so that the merge code doesn't have to concern itself with the details beyond that<br />
** new API: pgaio_impl->opening_fd(int fd, int flags) so that Windows impl can register fd with IOCP if flags & O_OVERLAPPED? currently that's all a bit kludgy<br />
* other things<br />
** using GetQueuedCompletionEventEx() requires Windows Vista, but PostgreSQL currently targets XP+. both are long dead, but the case for bumping it needs to be made in the community <br />
** does FileFlushBuffers() have an async cousin? doesn't look like it<br />
*** but there is an equivalent to fdatasync() in ntdll.dll; might be worth looking further<br />
** pgaio_iocp_closing_fd() should drain only IOs on the given fd, not all IOs issued by this backend<br />
*** compare see pgaio_posix_aio_closing_fd() -- it only drains results, it does not reap, to avoid deadlock risk! need something like that here too<br />
<br />
* fixed <br />
** DONE: calling it "iocp" for now ("windows" was too generic; we want to reserve the option to use the new Windows io_uring knockoff API which will probably be a separate method)<br />
** DONE: kill the IOCP thread, and teach pgaio_windows_drain() to drain?<br />
** DONE: solve the resulting deadlock by using the same procsignal trick as posix_aio?<br />
** DONE: we should use GetQueuedCompletionEventEx() to consume multiple completions in one call, instead of a loop!<br />
<br />
== io_method=ioring ==<br />
<br />
* no code yet, just an idea<br />
* ioring (note: no 'u', maybe should be win_ioring or some other name) is a hypothetical future IO method that would use [https://windows-internals.com/i-o-rings-when-one-i-o-operation-is-not-enough/ Windows I/O rings], a knock-off of Linux io_uring that is available in Windows 11 preview but still changing<br />
* so far the documentation only describes how to do reads but [https://twitter.com/axboe/status/1435614103314128897 we know that you can already do writes and flushes with the current preview], so there is enough there right now to write aio_ioring.c and hook it up<br />
* the easiest way could be to have one ioring per backend and use aio_exchange.c to deal with cross-process problems (like io_method=posix_aio and io_method=iocp), but it may also be possible to have N iorings that are somehow shared between processes and then use the context system for interlocking (like io_method=io_uring, the Linux one); can you do that, somehow share the handle + memory mapping for the s and c queues + correct wakeups?<br />
<br />
<br />
<br />
= Quick start for PostgreSQL hackers/reviewers =<br />
<br />
* Testing the default mode, simulated AIO using io_method=worker (= the default setting)<br />
** Try strace-ing the backend and IO worker processes to see how I/O syscalls are offloaded<br />
** Adjust the number of io_worker processes with io_workers=N<br />
** See the view pg_stat_aios that shows individual IOs<br />
** See the view pg_stat_aio_backends that shows per-backend info<br />
* Testing the use of direct I/O instead of PostgreSQL's traditional double buffering<br />
** Set io_direct=data to disable OS buffering of relation data<br />
** Set io_direct=wal to disable OS buffering WAL data<br />
** Set io_direct=wal,wal_init,data to disable all<br />
* Testing OS-specific options for "native" AIO<br />
** Linux io_uring<br />
*** install package liburing-dev (or liburing-devel on some distros)<br />
*** configure with --with-liburing (or if using Meson, it should find it by itself?)<br />
*** run with io_method=io_uring<br />
** POSIX AIO, on macOS, FreeBSD, NetBSD, AIX, illumos, Solaris, Linux<br />
*** configure --with-posix-aio<br />
*** run with io_method=posix_aio<br />
*** Some OS specific notes:<br />
**** Linux POSIX AIO is fake, simulated by glibc and musl with threads, and works well enough for testing but wouldn't be a good choice to actually use<br />
**** Solaris/illumos also has fake POSIX AIO (there is a more native/kernel supported AIO API but we don't support that)<br />
**** macOS has very tight limits on AIO; like every other application that uses AIO we're going to have to publish recommendations to crank them up ([http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.dc39123_1254/html/qinstmac/qinstmac11.htm Sybase for example], VirtualBox is another); on the bright side, macOS defaults provide a nice workout for the IO retry code paths; that is, often we try to start an IO and the kernel says EAGAIN, so we wait for one IO to complete and then try again, and you can see this in the pg_stat_aio_backends retry counter column<br />
**** AIX only works if you set shared_memory_type=sysv because AIX's AIO can't access memory we allocate with mmap() (otherwise all IO fails with EFAULT)<br />
**** AIX direct I/O might be unnecessarily serializing per-file, which we could fix with O_CONCURRENT, O_CIO or O_CIOR<br />
**** HPUX probably also needs O_CIO to avoid serializing direct IO (native AIO not working there yet but this applies to worker mode too)<br />
** Windows IOCP<br />
*** run with io_method=iocp<br />
*** (not yet pushed to main aio branch, find it in the aio-win32 branch until it's a little more complete...)</div>C2mainhttps://wiki.postgresql.org/index.php?title=Readahead&diff=38560Readahead2024-01-14T14:02:48Z<p>C2main: set category</p>
<hr />
<div>WIP<br />
<br />
= in Linux =<br />
<br />
== Summary ==<br />
<br />
; Very concise summary<br />
: readahead is a buzzword.<br />
<br />
;Concise summary<br />
: readahead in linux is more similar to readv implementation in PostgreSQL (at least the stream IO implemented by Thomas Munro for pg17).<br />
: It might be easier to read it this way to understand how it works.<br />
<br />
;Important points<br />
: readahead is not async<br />
: readahead is soon filesystem specific, see bottom of call tree below.<br />
<br />
On linux, readahead has evolved over time and the last known review was done by Neil Brown around April 8, 2022.<br/><br />
He wrote an article to summarize what he discovered: [https://lwn.net/Articles/888715/ Readahead: the documentation I wanted to read].<br />
<br />
Which led to this updated doc (apparently merged in 5.18): https://www.kernel.org/doc/html/latest/core-api/mm-api.html#readahead<br />
<br />
This is indeed a very nice summary and it covers mostly 2 entry functions for readahead: ''page_cache_sync_ra()'' and ''page_cache_async_ra()''.<br />
<br />
Both have a nice up to date documentation about their underlying functions: https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=page_cache_async_readahead#c.page_cache_sync_readahead<br />
<br />
----<br />
<br />
Take a moment to read the documentation (excerpt here):<br />
<br />
; page_cache_sync_readahead()<br />
: should be called when a cache miss happened: '''it will submit the read'''. The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.<br />
<br />
; page_cache_async_readahead()<br />
: should be called when a page is used which is marked as PageReadahead; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.<br />
<br />
OK, so far so good.<br />
<br />
But the code when doing ''posix_fadvise'' is distinct, and the entry point is:<br />
<br />
; page_cache_ra_unbounded()<br />
: This function is for filesystems to call when they want to start readahead beyond a file's stated ''i_size''. This is almost certainly not the function you want to call. Use ''page_cache_async_readahead()'' or ''page_cache_sync_readahead()'' instead. File is referenced by caller. Mutexes may be held by caller. '''May sleep, but will not reenter filesystem to reclaim memory.'''<br />
<br />
<br />
Part of the comment is explained in the code: this function apparently does not mark pages as ''PG_readahead'', but the other two above do. This mark on a page in the linux VM is a hint: if a page is read and was marked ''PG_readahead'', then the heuristic guessed right and linux can keep on prefetching.<br />
<br />
The other super important part is "May sleep": yes, IT IS NOT ASYNCHRONOUS. The only situation where it is partly asynchronous is when the storage is really congested, in which case the call is aborted partway through the provided readahead range. As mentioned by Neil, the "congestion" handling has not been followed everywhere and may not work as expected.<br />
<br />
Side note: linux splits the range into 2MB chunks to limit pinned memory and reduce locking. This size is hardcoded.<br />
<br />
== Call tree - The functions really used by linux posix_fadvise ==<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L31<br />
<br />
/*<br />
* POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could<br />
* deactivate the pages and clear PG_Referenced.<br />
*/<br />
int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/internal.h#L126<br />
<br />
inline wrapper<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L306<br />
<br />
/*<br />
* Chunk the readahead into 2 megabyte units, so that we don't pin too much<br />
* memory at once.<br />
*/<br />
void force_page_cache_ra(struct readahead_control *ractl,<br />
unsigned long nr_to_read)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L281<br />
<br />
/*<br />
* do_page_cache_ra() actually reads a chunk of disk. It allocates<br />
* the pages first, then submits them for I/O. This avoids the very bad<br />
* behaviour which would occur if page allocations are causing VM writeback.<br />
* We really don't want to intermingle reads and writes like that.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L205<br />
<br />
/**<br />
* page_cache_ra_unbounded - Start unchecked readahead.<br />
* @ractl: Readahead control.<br />
* @nr_to_read: The number of pages to read.<br />
* @lookahead_size: Where to start the next readahead.<br />
*<br />
* This function is for filesystems to call when they want to start<br />
* readahead beyond a file's stated i_size. This is almost certainly<br />
* not the function you want to call. Use page_cache_async_readahead()<br />
* or page_cache_sync_readahead() instead.<br />
*<br />
* Context: File is referenced by caller. Mutexes may be held by caller.<br />
* May sleep, but will not reenter filesystem to reclaim memory.<br />
*/ <br />
<br />
Some interesting comments appear in it during preallocation:<br />
<br />
/*<br />
* Partway through the readahead operation, we will have added<br />
* locked pages to the page cache, but will not yet have submitted<br />
* them for I/O. Adding another page may need to allocate memory,<br />
* which can trigger memory reclaim. Telling the VM we're in<br />
* the middle of a filesystem operation will cause it to not<br />
* touch file-backed pages, preventing a deadlock. Most (all?)<br />
* filesystems already specify __GFP_NOFS in their mapping's<br />
* gfp_mask, but let's be explicit here.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L146<br />
<br />
static void read_pages(struct readahead_control *rac)<br />
<br />
Then it is filesystem specific.<br />
<br />
=== EXT4 ===<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L3124<br />
<br />
static int ext4_read_folio(struct file *file, struct folio *folio)<br />
<br />
if not found in mem:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/readpage.c#L211<br />
<br />
int ext4_mpage_readpages(struct inode *inode,<br />
struct readahead_control *rac, struct folio *folio)<br />
<br />
after a lot of logic around sequential access, holes, ...:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L833<br />
<br />
<br />
/**<br />
* submit_bio - submit a bio to the block device layer for I/O<br />
* @bio: The &struct bio which describes the I/O<br />
*<br />
* submit_bio() is used to submit I/O requests to block devices. It is passed a<br />
* fully set up &struct bio that describes the I/O that needs to be done. The<br />
* bio will be send to the device described by the bi_bdev field.<br />
*<br />
* The success/failure status of the request, along with notification of<br />
* completion, is delivered asynchronously through the ->bi_end_io() callback<br />
* in @bio. The bio must NOT be touched by the caller until ->bi_end_io() has<br />
* been called.<br />
*/<br />
<br />
== First thoughts ==<br />
<br />
* issuing WILLNEED on a sequential pattern may just compete with linux's own readahead.<br />
* issuing WILLNEED apparently does not set PG_readahead<br />
<br />
* if PG_readahead is set, linux will interpret that as a successful sequential readahead pattern and may keep on doing more readahead from this block when reading (it's apparently mitigated by some hole detection).<br />
* if PG_readahead is not set when a page is read, it has not been checked how the readahead logic uses that.<br />
* having a DONTNEED just after a read is apparently optimized in linux code.<br />
<br />
* setting the RANDOM or SEQUENTIAL flag appears to influence the default linux readahead effectively. And it is costless.<br />
<br />
* it's possible that when pg prefetches pages, it prevents linux readahead from kicking in, assuming PG_readahead is not set for such a page when it is read.<br />
<br />
So I wonder if the improved performance is due not to prefetching but to preventing linux from running readahead, since the block visited is already in cache but not marked PG_readahead. <br/><br />
Does it work by helping to defeat linux's own readahead logic?<br />
<br />
But we could get the same effect with thousands fewer syscalls: just open the file and advise the kernel that access will be RANDOM, which reduces the linux readahead logic. <br/><br />
Then flag SEQUENTIAL when doing a BULKREAD or similar...<br />
<br />
[[Category:development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Readahead&diff=38559Readahead2024-01-14T14:01:31Z<p>C2main: reword the rendering</p>
<hr />
<div>WIP<br />
<br />
= in Linux =<br />
<br />
== Summary ==<br />
<br />
; Very concise summary<br />
: readahead is a buzzword.<br />
<br />
;Concise summary<br />
: readahead in linux is more similar to readv implementation in PostgreSQL (at least the stream IO implemented by Thomas Munro for pg17).<br />
: It might be easier to read it this way to understand how it works.<br />
<br />
;Important points<br />
: readahead is not async<br />
: readahead is soon filesystem specific, see bottom of call tree below.<br />
<br />
On linux, readahead has evolved over time and the last known review was done by Neil Brown around April 8, 2022.<br/><br />
He wrote an article to summarize what he discovered: [https://lwn.net/Articles/888715/ Readahead: the documentation I wanted to read].<br />
<br />
Which led to this updated doc (apparently merged in 5.18): https://www.kernel.org/doc/html/latest/core-api/mm-api.html#readahead<br />
<br />
This is indeed a very nice summary and it covers mostly 2 entry functions for readahead: ''page_cache_sync_ra()'' and ''page_cache_async_ra()''.<br />
<br />
Both have a nice up to date documentation about their underlying functions: https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=page_cache_async_readahead#c.page_cache_sync_readahead<br />
<br />
----<br />
<br />
Take a moment to read the documentation (excerpt here):<br />
<br />
; page_cache_sync_readahead()<br />
: should be called when a cache miss happened: '''it will submit the read'''. The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.<br />
<br />
; page_cache_async_readahead()<br />
: should be called when a page is used which is marked as PageReadahead; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.<br />
<br />
OK, so far so good.<br />
<br />
But the code when doing ''posix_fadvise'' is distinct, and the entry point is:<br />
<br />
; page_cache_ra_unbounded()<br />
: This function is for filesystems to call when they want to start readahead beyond a file's stated ''i_size''. This is almost certainly not the function you want to call. Use ''page_cache_async_readahead()'' or ''page_cache_sync_readahead()'' instead. File is referenced by caller. Mutexes may be held by caller. '''May sleep, but will not reenter filesystem to reclaim memory.'''<br />
<br />
<br />
Part of the comment is explained in the code: this function apparently does not mark pages as ''PG_readahead'', but the other two above do. This mark on a page in the linux VM is a hint: if a page is read and was marked ''PG_readahead'', then the heuristic guessed right and linux can keep on prefetching.<br />
<br />
The other super important part is "may sleep": yes, IT IS NOT ASYNCHRONOUS. The only situation where it is partly asynchronous is when the storage is really congested: in that case the call is aborted partway through the provided readahead range. As mentioned by Neil, the "congestion" tracking has not been followed everywhere and may not work as expected.<br />
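To make the synchronous nature concrete, here is a minimal userspace sketch (our illustration, assuming Linux): the WILLNEED call below itself runs the readahead machinery and may sleep before returning, rather than being fire-and-forget.<br />

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Ask the kernel to prefetch a whole file with POSIX_FADV_WILLNEED
 * (illustrative sketch).  Note: this call is NOT fire-and-forget; it
 * submits the reads itself and may sleep before returning. */
static int prefetch_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* offset = 0, len = 0 means "up to the end of the file" */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    close(fd);
    return rc;                  /* 0 on success, an errno value otherwise */
}
```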
<br />
Side note: Linux splits the range into chunks of 2 MB to limit pinned memory and reduce locking. This is hardcoded.<br />
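The chunking loop can be sketched as follows (constants and function names are ours; the kernel works in pages, assumed 4 KiB here):<br />

```c
#include <assert.h>

/* Sketch of the 2 MB chunking done by force_page_cache_ra() (names and
 * constants are ours; 4 KiB pages assumed). */
#define PAGE_SIZE   4096UL
#define CHUNK_PAGES (2UL * 1024 * 1024 / PAGE_SIZE)   /* 512 pages = 2 MiB */

/* Size of the next unit the kernel would issue for this request. */
static unsigned long next_chunk(unsigned long nr_to_read)
{
    return nr_to_read > CHUNK_PAGES ? CHUNK_PAGES : nr_to_read;
}

/* Walk a request chunk by chunk, as the kernel loop does. */
static unsigned long count_chunks(unsigned long nr_to_read)
{
    unsigned long chunks = 0;
    while (nr_to_read > 0) {
        nr_to_read -= next_chunk(nr_to_read);
        chunks++;
    }
    return chunks;
}
```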
<br />
== Call tree - The functions really used by linux posix_fadvise ==<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L31<br />
<br />
/*<br />
* POSIX_FADV_WILLNEED could set PG_Referenced, and POSIX_FADV_NOREUSE could<br />
* deactivate the pages and clear PG_Referenced.<br />
*/<br />
int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/internal.h#L126<br />
<br />
inline wrapper<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L306<br />
<br />
/*<br />
* Chunk the readahead into 2 megabyte units, so that we don't pin too much<br />
* memory at once.<br />
*/<br />
void force_page_cache_ra(struct readahead_control *ractl,<br />
unsigned long nr_to_read)<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L281<br />
<br />
/*<br />
* do_page_cache_ra() actually reads a chunk of disk. It allocates<br />
* the pages first, then submits them for I/O. This avoids the very bad<br />
* behaviour which would occur if page allocations are causing VM writeback.<br />
* We really don't want to intermingle reads and writes like that.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L205<br />
<br />
/**<br />
* page_cache_ra_unbounded - Start unchecked readahead.<br />
* @ractl: Readahead control.<br />
* @nr_to_read: The number of pages to read.<br />
* @lookahead_size: Where to start the next readahead.<br />
*<br />
* This function is for filesystems to call when they want to start<br />
* readahead beyond a file's stated i_size. This is almost certainly<br />
* not the function you want to call. Use page_cache_async_readahead()<br />
* or page_cache_sync_readahead() instead.<br />
*<br />
* Context: File is referenced by caller. Mutexes may be held by caller.<br />
* May sleep, but will not reenter filesystem to reclaim memory.<br />
*/ <br />
<br />
some interesting comments in it, during preallocation:<br />
<br />
/*<br />
* Partway through the readahead operation, we will have added<br />
* locked pages to the page cache, but will not yet have submitted<br />
* them for I/O. Adding another page may need to allocate memory,<br />
* which can trigger memory reclaim. Telling the VM we're in<br />
* the middle of a filesystem operation will cause it to not<br />
* touch file-backed pages, preventing a deadlock. Most (all?)<br />
* filesystems already specify __GFP_NOFS in their mapping's<br />
* gfp_mask, but let's be explicit here.<br />
*/<br />
<br />
https://elixir.bootlin.com/linux/latest/source/mm/readahead.c#L146<br />
<br />
static void read_pages(struct readahead_control *rac)<br />
<br />
Then it is filesystem specific.<br />
<br />
=== EXT4 ===<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L3124<br />
<br />
static int ext4_read_folio(struct file *file, struct folio *folio)<br />
<br />
if the folio is not found in memory:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/fs/ext4/readpage.c#L211<br />
<br />
int ext4_mpage_readpages(struct inode *inode,<br />
struct readahead_control *rac, struct folio *folio)<br />
<br />
after a lot of logic around sequential access, holes, etc.:<br />
<br />
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L833<br />
<br />
<br />
/**<br />
* submit_bio - submit a bio to the block device layer for I/O<br />
* @bio: The &struct bio which describes the I/O<br />
*<br />
* submit_bio() is used to submit I/O requests to block devices. It is passed a<br />
* fully set up &struct bio that describes the I/O that needs to be done. The<br />
* bio will be send to the device described by the bi_bdev field.<br />
*<br />
* The success/failure status of the request, along with notification of<br />
* completion, is delivered asynchronously through the ->bi_end_io() callback<br />
* in @bio. The bio must NOT be touched by the caller until ->bi_end_io() has<br />
* been called.<br />
*/<br />
<br />
== First thoughts ==<br />
<br />
* issuing a WILLNEED on a sequential pattern may just compete with Linux's own readahead.<br />
* issuing a WILLNEED apparently does not set ''PG_readahead''.<br />
<br />
* if ''PG_readahead'' is set, Linux will interpret that as a successful sequential readahead pattern and may keep doing more readahead from this block when reading (apparently mitigated by some hole detection).<br />
* if ''PG_readahead'' is not set when reading, it has not been checked how the readahead logic uses that.<br />
* issuing a DONTNEED just after a read is apparently optimized in the Linux code.<br />
<br />
* setting the RANDOM or SEQUENTIAL flag looks like it effectively influences the default Linux readahead, and it is costless.<br />
<br />
* it is possible that when PostgreSQL prefetches pages, it reduces the chance of Linux readahead kicking in, assuming ''PG_readahead'' is not set on such a page when it is read.<br />
<br />
So I wonder whether the improved performance is due not to the prefetching itself but to the fact that it prevents Linux from running readahead: the block visited is already in cache but not marked ''PG_readahead''. <br/><br />
Does it work by helping Linux defeat its own readahead logic?<br />
<br />
But we could get there with thousands fewer syscalls: just open the file and advise the kernel that access will be RANDOM, which reduces the Linux readahead logic. <br/><br />
Then flag SEQUENTIAL when doing a BULKREAD or similar...<br />
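A hypothetical sketch of that approach (function names and placement are ours, not PostgreSQL's): advise RANDOM once at open time to damp the kernel's own readahead, then switch to SEQUENTIAL right before a bulk read.<br />

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical sketch (names are ours): open with RANDOM advice so the
 * kernel does less speculative readahead on this descriptor... */
static int open_with_random_advice(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd >= 0)
        (void) posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
    return fd;
}

/* ...then flip to SEQUENTIAL just before a bulk read, letting the
 * kernel readahead work for us again.  Returns 0 on success. */
static int begin_bulk_read(int fd)
{
    /* offset = 0, len = 0 covers the whole file */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}
```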
<br />
[[category::development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=User:C2main&diff=38553User:C2main2024-01-04T10:57:52Z<p>C2main: adding garage patches</p>
<hr />
<div>Cédric Villemain is leading [http://data-bene.io Data-Bene], a company focused on PostgreSQL services (support, training, expertise, R&D).<br />
<br />
<br />
A few old and new ideas, often with patches and tests already:<br />
<br />
* WAL sender lag metric<br />
* synchronous_timeout (suspend sync rep for a moment, at some point)<br />
* Allow empty string on NULL output in ecpg<br />
* Allow use of pg_regress TAP tests in extension makefiles (else pg_regress/pg_prove is not found)<br />
* Allow extensions to install header files<br />
* PSI autothrottling (autovacuum, bgwriter, prefetching) / resources monitor<br />
* fallback plan when conditions reached (for JOIN on badly distributed attributes)<br />
* PostgreSQL standby init and restart via SQL protocol (for spawning new replica from existing pg cluster)</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:PgWiki&diff=38450Category:PgWiki2023-11-28T14:09:45Z<p>C2main: make user account more visible</p>
<hr />
<div><br />
Here are some ways you can contribute and help maintain the wiki:<br />
<br />
# '''Adding Content''': If you have insights, tips, or tutorials related to PostgreSQL, don't hesitate to share them by creating new articles. Your expertise can benefit others in the PostgreSQL community.<br />
# '''Improving Existing Articles''': Enhance the quality of articles by correcting errors, updating outdated information, and providing clearer explanations. Every small improvement can make a big difference.<br />
# '''Defining Categories''': If you come across content that is not properly categorized, take a moment to assign appropriate categories. Well-defined categories make navigation more intuitive for users.<br />
# '''Language Contributions''': If you are fluent in a language not covered in the user documentation, consider translating important articles into that language. This will expand the reach of PostgreSQL knowledge to a wider audience.<br />
# '''Fixing Broken Links''': If you notice broken links or missing references, help keep the wiki reliable by fixing these issues. This ensures that users can access the intended information without frustration.<br />
# '''Adding Language Tags''': If you find articles without language tags for non-English content, add the relevant language tags to make it easier for users to identify the language of the article.<br />
# '''Providing Feedback''': If you have suggestions to improve the structure or organization of the wiki, share your ideas with the community. Your feedback can lead to meaningful enhancements.<br />
<br />
Remember, maintaining the wiki is a collaborative effort, and even small contributions can have a positive impact. By sharing your knowledge and expertise, you help build a robust and helpful resource for PostgreSQL users worldwide.<br />
<br />
= User Accounts =<br />
<br />
To contribute to this wiki, you'll need a [https://www.postgresql.org/account/login/ PostgreSQL community account], which is the same account used for news and event submissions on [http://www.postgresql.org www.postgresql.org].<br />
<br />
'''All activity on this site should follow the [[Policies|PostgreSQL Project Policies]].'''<br />
<br />
'''NOTE''': due to spam activity, "editor" privileges are granted manually. If you just created a new community account, or if your current account used to have "editor" privileges, ask for help on either the [mailto:pgsql-www@postgresql.org PostgreSQL -www mailing list] or the [irc://irc.libera.chat/postgresql PostgreSQL IRC Channel] (direct your request to 'pginfra'; multiple individuals in the channel highlight on that string). Please include your community account name in those requests.<br />
<br />
<br />
[[Category:Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Ecosystem:PostgreSQL_only:Backup&diff=38449Ecosystem:PostgreSQL only:Backup2023-11-28T13:57:04Z<p>C2main: set eco backup category</p>
<hr />
<div>== Barman - Backup and Recovery Manager for PostgreSQL ==<br />
<br />
* Provider -- 2ndQuadrant<br />
* Website -- https://www.2ndquadrant.com/<br />
* License -- GPL v3<br />
* Interoperability level -- See: http://www.pgbarman.org/<br />
* Verified PostgreSQL versions -- https://www.2ndquadrant.com/en/resources/barman/<br />
* Last update (YYYY-MM-DD) -- 2020-07-09<br />
* Description -- Barman (Backup and Recovery Manager) is an open-source administration tool for disaster recovery of PostgreSQL databases with high business continuity requirements. Barman allows remote backups of multiple servers in business critical environments and helps DBAs during the recovery phase.<br />
* Additional info -- http://www.pgbarman.org/<br />
<br />
== EDB Postgres Backup and Recovery ==<br />
<br />
* Provider -- EnterpriseDB Corporation<br />
* Website -- https://www.enterprisedb.com/product/edb-postgres-platform/edb-postgres-backup-%26-recovery%20bart<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL 9.5 or higher<br />
* Verified PostgreSQL versions -- See: https://www.enterprisedb.com/services/edb-supported-products-and-platforms#BART<br />
* Last update (YYYY-MM-DD) -- 2017-6-2<br />
* Description -- EDB Postgres Backup and Recovery Tool safeguards local and remote data for disaster recovery by eliminating error prone scripting and manual DBA tasks with simple commands, a CLI, a centrally managed catalog, PITR, retention policies, compression and other automated features making data protection and disaster recovery easy and trouble-free without interrupting ongoing operations.<br />
* Additional info -- [[Ecosystem:EDB Postgres Backup and Recovery|click here]]<br />
<br />
[[Category:Ecosystem:Backup]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Ecosystem:Backup&diff=38448Category:Ecosystem:Backup2023-11-28T13:55:41Z<p>C2main: again ...</p>
<hr />
<div>For those related to backup (and restore)<br />
<br />
<br />
[[Category:Ecosystem]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Ecosystem:Backup&diff=38446Ecosystem:Backup2023-11-28T13:54:19Z<p>C2main: set eco backup category</p>
<hr />
<div>== Amanda ==<br />
<br />
* Provider -- Zmanda, Inc., University of Maryland at College Park<br />
* Website -- http://www.amanda.org/<br />
* License -- open source (BSD-style, GPLv2) for Community Edition, and proprietary for Enterprise Edition<br />
* Interoperability level -- explicitly supports PostgreSQL 8.x or later<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-3-11<br />
* Description -- AMANDA, the Advanced Maryland Automatic Network Disk Archiver, is a backup system that allows the administrator of a LAN to set up a single master backup server to back up multiple hosts to a single large capacity tape or disk drive. Amanda uses native tools (such as GNUtar, dump) for backup and can back up a large number of workstations running multiple versions of Unix/Mac OS X/Linux/Windows.<br />
* Additional info -- [[Ecosystem:Amanda|click here]]<br />
<br />
== Bacula ==<br />
<br />
* Provider -- Kern Sibbald, Bacula Systems S.A.<br />
* Website -- https://www.bacula.org/<br />
* License -- open source (AGPLv3) for Community Edition, and proprietary for Enterprise Edition<br />
* Interoperability level -- explicitly supports PostgreSQL 7.4 or later<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-3-11<br />
* Description -- Bacula is a set of Open Source computer programs that permit you (or the system administrator) to manage backup, recovery, and verification of computer data across a network of computers of different kinds. Bacula is relatively easy to use and very efficient, while offering many advanced storage management features that make it easy to find and recover lost or damaged files. In technical terms, it is an Open Source, network based backup program. According to Source Forge statistics (rank and downloads), Bacula is by far the most popular Open Source backup program.<br />
* Additional info -- [[Ecosystem:Bacula|click here]]<br />
<br />
== Barman - Backup and Recovery Manager for PostgreSQL ==<br />
<br />
* Provider -- 2ndQuadrant<br />
* Website -- https://www.2ndquadrant.com/<br />
* License -- GPL v3<br />
* Interoperability level -- See: http://www.pgbarman.org/<br />
* Verified PostgreSQL versions -- https://www.2ndquadrant.com/en/resources/barman/<br />
* Last update (YYYY-MM-DD) -- 2020-07-09<br />
* Description -- Barman (Backup and Recovery Manager) is an open-source administration tool for disaster recovery of PostgreSQL databases with high business continuity requirements. Barman allows remote backups of multiple servers in business critical environments and helps DBAs during the recovery phase.<br />
* Additional info -- http://www.pgbarman.org/<br />
<br />
== Handy Backup ==<br />
<br />
* Provider -- Novosoft LLC<br />
* Website -- https://www.handybackup.net/<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL 9 or later<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-3-11<br />
* Description -- Handy Backup is famous for being a Swiss Army Knife of data backup tools. Its functionality covers everything from files-based copying and disk imaging to multi-server backup.<br />
* Additional info -- [[Ecosystem:Handy Backup|click here]]<br />
<br />
== Iperius Backup ==<br />
<br />
* Provider -- Enter Srl<br />
* Website -- https://www.iperiusbackup.com/<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL (no specific version is specified)<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-03-11<br />
* Description -- Iperius Backup is a backup software for Windows. With the introduction of PostgreSQL backup, Iperius is now one of the most complete database backup solutions available: with a single Advanced DB license, you can back up an unlimited number of servers and databases (Oracle, SQL Server, PostgreSQL, MySQL and MariaDB). It also offers other powerful features such as automatic compression, encryption, and automatic copying to any destination (NAS, FTP server, Cloud, Google Drive, Amazon S3, etc.).<br />
* Additional info -- [[Ecosystem:Iperius Backup|click here]]<br />
<br />
== NetVault Backup ==<br />
<br />
* Provider -- Quest Software Inc.<br />
* Website -- https://www.quest.com/products/netvault-backup/<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL 8.x or later<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-03-11<br />
* Description -- Protect data in diverse IT environments from one intuitive console with this scalable backup and recovery solution. NetVault Backup supports multiple server and application platforms in both physical and virtual environments. That means you can ensure availability of business-critical applications, including Oracle, Exchange, MySQL, SQL Server, DB2, and SAP. With NetVault Backup, you can safeguard information stored on network-attached storage (NAS) devices. It also allows you to back up to tape or disk, as well as leverage data deduplication to minimize your storage footprint.<br />
* Additional info -- [[Ecosystem:NetVault Backup|click here]]<br />
<br />
== pg_probackup ==<br />
<br />
* Provider -- Postgres Professional<br />
* Website -- https://github.com/postgrespro/pg_probackup<br />
* License -- PostgreSQL License<br />
* Interoperability level -- >= 9.5 <br />
* Verified PostgreSQL versions -- >= 9.5<br />
* Last update (YYYY-MM-DD) -- 2019-08-13<br />
* Current Version: 2.3.1<br />
* Description -- pg_probackup is a feature-rich and simple to use utility to manage backup and recovery of PostgreSQL database clusters. It is designed to perform periodic backups of the PostgreSQL instance that enable you to restore the server in case of a failure.<br />
* Documentation -- https://postgrespro.github.io/pg_probackup/<br />
<br />
== pgBackRest ==<br />
<br />
* Provider -- David Steele, Crunchy Data<br />
* Website -- https://github.com/pgbackrest/pgbackrest<br />
* License -- MIT License<br />
* Interoperability level -- >= 9.5 <br />
* Verified PostgreSQL versions -- >= 9.5<br />
* Last update (YYYY-MM-DD) -- 2020-10-05<br />
* Current Version: 2.30<br />
* Description -- pgBackRest aims to be a reliable, easy-to-use backup and restore solution that can seamlessly scale up to the largest databases and workloads by utilizing algorithms that are optimized for database-specific requirements.<br />
* Documentation -- https://pgbackrest.org/#getting-started<br />
<br />
== pgmoneta ==<br />
<br />
* Provider -- pgmoneta community<br />
* Website -- https://github.com/pgmoneta/pgmoneta<br />
* License -- BSD 3-Clause License<br />
* Interoperability level -- >= 10 <br />
* Verified PostgreSQL versions -- >= 10<br />
* Last update (YYYY-MM-DD) -- 2022-09-22<br />
* Current Version: 0.6.0<br />
* Description -- pgmoneta is a backup / restore solution that supports on-disk encryption, storage engines, monitoring and symlinks.<br />
* Documentation -- https://pgmoneta.github.io/gettingstarted.html<br />
<br />
== Simpana ==<br />
<br />
* Provider -- Commvault<br />
* Website -- https://www.commvault.com/<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL 8.x or later<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-03-11<br />
* Description -- Simpana software offers seamless and efficient backup and restore of data and information in your enterprise from any operating system, database, and application. Simpana software builds on this foundation by integrating application awareness with hardware snapshots, indexing, global deduplication, replication, search, and reporting, all within a single platform.<br />
* Additional info -- [[Ecosystem:Simpana|click here]]<br />
<br />
== Spectrum Protect ==<br />
<br />
* Provider -- IBM Corporation<br />
* Website -- https://www.ibm.com/us-en/marketplace/data-protection-and-recovery<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL (no specific version is specified)<br />
* Verified PostgreSQL versions -- didn't actually run the program, but just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2018-03-11<br />
* Description -- IBM Spectrum Protect can simplify data protection where data is hosted in physical, virtual, software-defined or cloud environments. With IBM Spectrum Protect, you can choose the right software to manage and protect your data while also simplifying backup administration, improving efficiencies, delivering scalable capacity and enabling advanced capabilities. With superior virtual machine (VM) protection, IBM Spectrum Protect integrates with IBM Spectrum Protect Plus for fast and easy VM protection with a searchable catalog and role-based administration.<br />
* Additional info -- [[Ecosystem:Spectrum Protect|click here]]<br />
<br />
== SPFS ==<br />
<br />
* Provider -- Spictera & IBM<br />
* Website -- https://www-356.ibm.com/partnerworld/gsd/solutiondetails.do?solution=56435&lc=en&stateCd=P&tab=2<br />
* Website -- https://www.suse.com/susePSC/viewVersionPage?versionID=20888<br />
* Website -- https://access.redhat.com/ecosystem/software/4167431<br />
* Website -- http://www.spictera.com<br />
* License -- proprietary<br />
* Interoperability -- generic data protection, supports any PostgreSQL version<br />
* Verified PostgreSQL versions -- PostgreSQL 8/9/10/11<br />
* Last update (YYYY-MM-DD) -- 2019-06-18<br />
* Description -- SPFS is a file system that makes it possible to mount IBM Spectrum Protect filespaces anywhere on your server. All file operations go directly via the IBM Spectrum Protect Client API to the IBM Spectrum Protect backup server. It is very easy to integrate WAL backups, and one can use the preferred backup methods (pg_dump, pg_basebackup) or any other combination the PostgreSQL administrator prefers. Data reduction of roughly 75-85% is achievable using deduplication in combination with compression, and it is possible to encrypt data using private keys. It is easy to use and requires almost no training.<br />
<br />
== Veritas NetBackup for PostgreSQL Agent ==<br />
<br />
* Provider -- Veritas<br />
* Website -- https://www.enterprisedb.com/blog/veritas-netbackup-and-edb-postgres<br />
* License -- proprietary<br />
* Interoperability level -- explicitly supports PostgreSQL 9.x or later<br />
* Verified PostgreSQL versions -- didn't actually run the program, just checked the documentation<br />
* Last update (YYYY-MM-DD) -- 2019-06-28<br />
* Description -- This is a PostgreSQL-specific agent for NetBackup, the enterprise backup and recovery solution. It uses filesystem snapshot technology to take a cohesive backup of configured PostgreSQL databases, rather than dumping to an external file and backing that up. This (at least in theory) means backup and recovery are both efficient. The NetBackup documentation lists the agent as supported on Windows and Linux (RHEL, SLES).<br />
* Documentation -- https://www.veritas.com/content/support/en_US/doc/129277259-137906533-0/v129276450-137906533<br />
<br />
[[Category:Ecosystem:Backup]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=91ReleaseSchedule&diff=3844591ReleaseSchedule2023-11-28T13:47:08Z<p>C2main: Undo revision 38444 by C2main (talk)</p>
<hr />
<div>Do not use this page. Use [[91WebTasks]] instead.<br />
[[Category:DELETEME]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=91ReleaseSchedule&diff=3844491ReleaseSchedule2023-11-28T13:46:50Z<p>C2main: deleted</p>
<hr />
<div></div>C2mainhttps://wiki.postgresql.org/index.php?title=Template:Quotation&diff=38443Template:Quotation2023-11-28T13:44:54Z<p>C2main: set category template</p>
<hr />
<div><blockquote class="toccolours" style="float:none; padding: 10px 15px 10px 15px; display:table;"> {{{1<noinclude>|&nbsp;&nbsp;{{Lorem}}</noinclude>}}}<br />
<br />
{{#if:{{{2|<noinclude>x</noinclude>}}}<br />
|<p style="text-align: right;"> — <cite>{{{2}}}{{#if:{{{3|<noinclude>x</noinclude>}}}<br />
|, ''{{{3}}}''<br />
}}{{#if:{{{4|<noinclude>x</noinclude>}}}<br />
|, {{{4}}}<br />
}}</cite></p><br />
}}</blockquote><noinclude><br />
<br />
{{Documentation}}<br />
<!-- PLEASE ADD CATEGORIES AND INTERWIKIS TO THE /doc SUBPAGE, THANKS --><br />
[[Category:PgWiki:Templates]]<br />
</noinclude></div>C2mainhttps://wiki.postgresql.org/index.php?title=Template:SnippetInfo2&diff=38442Template:SnippetInfo22023-11-28T13:43:41Z<p>C2main: missing noinclude...</p>
<hr />
<div><div style="width: 11em; float: right; border: 1px solid #ccc; padding: 1ex 1em;"><br />
<big style="color: #e65600;"><b>[[:Category:{{{category|}}} Snippets|{{{category|}}} Snippets]]</b></big><br />
<p style="font-weight: bold;">{{{1}}}</p><br />
<div style="background: #eee; text-align: center; padding: 2px;">Works with PostgreSQL</div><br />
{{{version|Any version}}}<br />
<div style="background: #eee; text-align: center; padding: 2px;">Written in</div><br />
{{{lang|&nbsp;}}}<br />
<div style="background: #eee; text-align: center; padding: 2px;">Depends on</div><br />
{{{depends|Nothing}}}<br />
</div><br />
<noinclude><br />
[[Category:PgWiki:Templates]]<br />
</noinclude></div>C2mainhttps://wiki.postgresql.org/index.php?title=Template:SnippetInfo2&diff=38441Template:SnippetInfo22023-11-28T13:42:05Z<p>C2main: set category template</p>
<hr />
<div><div style="width: 11em; float: right; border: 1px solid #ccc; padding: 1ex 1em;"><br />
<big style="color: #e65600;"><b>[[:Category:{{{category|}}} Snippets|{{{category|}}} Snippets]]</b></big><br />
<p style="font-weight: bold;">{{{1}}}</p><br />
<div style="background: #eee; text-align: center; padding: 2px;">Works with PostgreSQL</div><br />
{{{version|Any version}}}<br />
<div style="background: #eee; text-align: center; padding: 2px;">Written in</div><br />
{{{lang|&nbsp;}}}<br />
<div style="background: #eee; text-align: center; padding: 2px;">Depends on</div><br />
{{{depends|Nothing}}}<br />
</div><br />
<br />
[[Category:PgWiki:Templates]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Asking_Questions&diff=38440Category:Asking Questions2023-11-28T13:36:10Z<p>C2main: set end user</p>
<hr />
<div>{{Enduser}} <br />
<br />
Asking Questions</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Bi-Directional_Replication&diff=38439Category:Bi-Directional Replication2023-11-28T13:30:57Z<p>C2main: remove administration category as it is not relevant anymore (only devel pages and page redirects)</p>
<hr />
<div>Bi-directional replication, or BDR, is a PostgreSQL enhancement project that is available as a usable release in its own right.<br />
<br />
See [https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/ Postgres-BDR project home page] for details.<br />
<br />
[[Category:Replication]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:General_articles_and_guides&diff=38438Category:General articles and guides2023-11-28T13:25:37Z<p>C2main: set end user</p>
<hr />
<div>{{Enduser}} <br />
<br />
General articles and guides</div>C2mainhttps://wiki.postgresql.org/index.php?title=Community_Generated_Articles,_Guides,_and_Documentation&diff=38437Community Generated Articles, Guides, and Documentation2023-11-28T13:22:51Z<p>C2main: set end user</p>
<hr />
<div>{{Enduser}} <br />
<br />
=Major article and documentation topic subsections=<br />
* [[Detailed installation guides]]<br />
* [[Converting from other Databases to PostgreSQL]]<br />
* [[Database Administration and Maintenance]]<br />
* [[Development Articles|Development with PostgreSQL]]<br />
* [[Performance Case Study]]<br />
* [[Performance Optimization|PostgreSQL Performance Optimization]]<br />
* [[Frequently Asked Questions]]<br />
* [[Replication, Clustering, and Connection Pooling]]<br />
<br />
=Community Generated Articles=<br />
<br />
== Introductory ==<br />
<br />
* [[Introduction to PostgreSQL Rules - Making entries which can't be altered]] by Justin Clift (2001-07-20)<br />
* [[:Image:SequencesTutorial.swf|A flash tutorial on Sequences]] by Justin Clift<br />
* [[:Image:ReferentialIntegrityTutorial.swf|A flash tutorial on Referential Integrity]] by Justin Clift<br />
<br />
== Other Applications ==<br />
<br />
* [[Ecosystem:PostgreSQL ecosystem|PostgreSQL ecosystem]] - a directory of software and cloud services that support or use PostgreSQL<br />
* [[PostgreSQL Clients]]<br />
* [[Community Guide to PostgreSQL GUI Tools]]<br />
* [[Design Tools]] - GUI tools for ERD schema design<br />
* [http://qmail-sql.digibel.be/ Qmail using PostgreSQL user authentication] by Michael Devogelaere<br />
* HOWTO: [[PostgreSQL and Crystal Reports]] by Mike Godshall (2004-03-24)<br />
* [[:Image:Postgres-Domino6.pdf|Using PostgreSQL for a Domino 6 RDBMS Backend]] (pdf) by Keith Perry<br />
* [[Using Echelon SQL Scripter with PostgreSQL]] by Branden Williams and Matt Springfield<br />
* [[PostgreSQL derived databases]]<br />
* [http://www.postgresonline.com/journal/index.php?/archives/8-Using-OpenOffice-Base-2.3.1-with-PostgreSQL.html Using Open Office Base 2.3.1 with PostgreSQL] Postgres OnLine Journal (January/February 2008)<br />
* [http://wiki.services.openoffice.org/wiki/Base/connectivity/PostgreSQL Using OOo Base to connect to PostgreSQL] - OpenOffice.org Wiki<br />
<br />
== PostgreSQL Usage ==<br />
<br />
* [https://dbconvert.com/blog/postgresql-change-data-capture-cdc/ PostgreSQL Change data capture (CDC) + golang sample code.]<br />
* [[A Brief Real-world Trigger Example]] by Richard Huxton<br />
* [http://www.alberton.info/postgresql_meta_info.html Retrieving Database meta-data from the Information Schema] by Lorenzo Alberton (2006-05-02)<br />
* [http://wiki.ael.be/index.php/PostgresQL101 PostgreSQL 101] by Shridhar Daithankar (2006-05-01)<br />
* FAQ: [[Working with Dates and Times in PostgreSQL]] by Josh Berkus (2001-08-25)<br />
* [[How to avoid overlapping intervals with PostgreSQL]] by Matteo Nastasi (2001-07-16)<br />
* [[COPY]] by Jeff Eckermann (2001-09-17), rewritten (2008-03-02) Evan Carroll<br />
* [http://www.postgresonline.com/journal/archives/177-postgresql90_cheatsheet.html PostgreSQL 9.0 Cheat Sheet] by Postgres OnLine (September/October 2010)<br />
* [http://www.postgresonline.com/journal/index.php?/archives/55-PostgreSQL-8.3-TSearch-Cheat-Sheet-Overview.html PostgreSQL 8.3 TSearch Full-Text Search Cheat Sheet] by Postgres OnLine (May/June 2008)<br />
* [http://www.postgis.us/study_guides PostGIS related study guides] Regina Obe and Leo Hsu<br />
* [[Don't Do This]]<br />
<br />
== Troubleshooting ==<br />
<br />
* [[Generating a stack trace of a PostgreSQL backend]]<br />
<br />
= Deprecated articles =<br />
These are articles that are not really applicable to current versions of PostgreSQL, but are kept here for those still running those old versions:<br />
<br />
* [[Information on the 8.1.4/8.0.8/etc. Security Release]]<br />
* [http://mark.stosberg.com/Tech/postgres/pg-65-7-upgrade.html Tips for upgrading PostgreSQL from 6.5.3 to 7.0.3] by Mark Stosberg<br />
* [http://www.linux-mag.com/id/1573 Overview of new features in PostgreSQL 7.4] by Josh Berkus and Joe Conway (2004-02)<br />
* Replication HOWTO: [[Setting up RServ with PostgreSQL 7.0.3]] by S.Gopi (2001-09-13)<br />
* [http://www.rot13.org/%7Edpavlin/usogres/index.html Usogres Replication Test report] by Dobrica Pavlinusic (2002-04-22)<br />
* HOWTO : [[Full Text Indexing with PostgreSQL]] using contrib/fulltextindex by Christopher Kings-Lynne (2002-04-07)<br />
* [[PostgreSQL for Enterprise Business Applications]] by Ron Chmara (2002-01-29)<br />
<br />
[[Category:General articles and guides|!]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Snippets&diff=38436Category:Snippets2023-11-28T13:18:01Z<p>C2main: remove category documentation</p>
<hr />
<div>{{Enduser}} <br />
<br />
This is a collection of code examples, queries, functions, etc. for PostgreSQL.<br />
See the [[Talk:Snippets|Talk page]] for instructions on adding a new snippet.<br />
<br />
=== See also ===<br />
The [[:Category:Languages|language category]] contains a list of languages that snippets can be written in.<br />
<br />
[http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks This page] contains a bunch more useful "tricks". Some of them can probably be moved to this Wiki; {{messageLink|162867790905180929h4df06843q5a808e1732b6acad@mail.gmail.com|we have permission}} from the author.</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:End-Users&diff=38435Category:End-Users2023-11-28T13:16:57Z<p>C2main: Set this category a top level one.</p>
<hr />
<div>All wiki content for end-users of PostgreSQL.<br />
<br />
[[Category:Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:End-Users&diff=38434Category:End-Users2023-11-28T13:16:29Z<p>C2main: For End users of PostgreSQL</p>
<hr />
<div>All wiki content for end-users of PostgreSQL.</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Snippets&diff=38433Category:Snippets2023-11-28T13:15:31Z<p>C2main: set end user</p>
<hr />
<div>{{Enduser}} <br />
<br />
This is a collection of code examples, queries, functions, etc. for PostgreSQL.<br />
See the [[Talk:Snippets|Talk page]] for instructions on adding a new snippet.<br />
<br />
=== See also ===<br />
The [[:Category:Languages|language category]] contains a list of languages that snippets can be written in.<br />
<br />
[http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks This page] contains a bunch more useful "tricks". Some of them can probably be moved to this Wiki; {{messageLink|162867790905180929h4df06843q5a808e1732b6acad@mail.gmail.com|we have permission}} from the author.<br />
<br />
[[Category:Documentation]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Template:Enduser&diff=38432Template:Enduser2023-11-28T13:11:01Z<p>C2main: Add a template to define End-Users category on a page</p>
<hr />
<div><noinclude><br />
<br />
== Description ==<br />
<br />
Adds the including page to [[:Category:End-Users]].<br />
<br />
== Usage ==<br />
<br />
<nowiki>{{</nowiki>'''Enduser'''}}<br />
<br />
== Template ==<br />
<br />
[[Category:Category_templates]]<br />
</noinclude><br />
<br />
<includeonly><br />
[[Category:End-Users]]<br />
</includeonly></div>C2mainhttps://wiki.postgresql.org/index.php?title=WIP_alternative_Main_Page&diff=38431WIP alternative Main Page2023-11-28T12:34:11Z<p>C2main: End-users</p>
<hr />
<div>{{Languages}}<br />
__NOTOC__<br />
<br />
= Welcome to the PostgreSQL Wiki! =<br />
<br />
This wiki is designed to provide a comprehensive resource for end-users of PostgreSQL and the PostgreSQL community, and to discover the PostgreSQL ecosystem.<br />
<br />
== End-users ==<br />
<br />
If you're a user of PostgreSQL, you probably want to explore the following category: [[:Category:End-Users]]<br />
<br />
Do not forget the official PostgreSQL documentation and additional learning resources:<br />
<br />
* [http://www.postgresql.org/docs/manuals/ PostgreSQL Manuals]<br />
* [http://www.postgresql.org/docs/books/ PostgreSQL Books]<br />
* [https://www.postgresql.org/docs/current/tutorial.html PostgreSQL Tutorials]<br />
<br />
== Community ==<br />
<br />
For those interested in contributing to PostgreSQL or staying updated on the community:<br />
<br />
You can connect with PostgreSQL users and developers on [irc://irc.libera.chat/postgresql #postgresql on Libera]. Check out the list of IRC nicknames with real-world names [[IRC2RWNames | here]].<br />
<br />
There is a root category to explore: [[:Category:Community]]<br />
<br />
== PostgreSQL ecosystem ==<br />
<br />
For discovering PostgreSQL-related software, tools, extensions, and more.<br />
<br />
There is a root category to explore: [[:Category:Ecosystem]]<br />
<br />
== Contributing and Maintaining the Wiki ==<br />
<br />
We encourage users like you to actively contribute to this PostgreSQL Wiki, making it a valuable resource for the community. Your contributions can range from adding new content to improving existing articles, ensuring that users can find the information they need easily.<br />
<br />
There is a root category to explore: [[:Category:PgWiki]]<br />
<br />
== Wiki in other languages ==<br />
<br />
If you're more comfortable with languages other than English, you can find and edit relevant content here:<br />
<br />
* [[Main Page/de|deutsch]]<br />
* [[Main Page/es|español]] <br />
* [[Main Page/fr|français]]<br />
* [[Main Page/he|עברית]]<br />
* [[Main Page/ja|日本語]]<br />
* [[Main Page/pt|português]]<br />
* [[Main Page/ru|русский]]<br />
* [[Main Page/zh|中文]]<br />
* [[Main Page/zh-hant|中文(繁體)]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:PgWiki&diff=38430Category:PgWiki2023-11-28T12:14:03Z<p>C2main: Some editors tips</p>
<hr />
<div><br />
Here are some ways you can contribute and help maintain the wiki:<br />
<br />
# '''Adding Content''': If you have insights, tips, or tutorials related to PostgreSQL, don't hesitate to share them by creating new articles. Your expertise can benefit others in the PostgreSQL community.<br />
# '''Improving Existing Articles''': Enhance the quality of articles by correcting errors, updating outdated information, and providing clearer explanations. Every small improvement can make a big difference.<br />
# '''Defining Categories''': If you come across content that is not properly categorized, take a moment to assign appropriate categories. Well-defined categories make navigation more intuitive for users.<br />
# '''Language Contributions''': If you are fluent in a language not covered in the user documentation, consider translating important articles into that language. This will expand the reach of PostgreSQL knowledge to a wider audience.<br />
# '''Fixing Broken Links''': If you notice broken links or missing references, help keep the wiki reliable by fixing these issues. This ensures that users can access the intended information without frustration.<br />
# '''Adding Language Tags''': If you find articles without language tags for non-English content, add the relevant language tags to make it easier for users to identify the language of the article.<br />
# '''Providing Feedback''': If you have suggestions to improve the structure or organization of the wiki, share your ideas with the community. Your feedback can lead to meaningful enhancements.<br />
<br />
Remember, maintaining the wiki is a collaborative effort, and even small contributions can have a positive impact. By sharing your knowledge and expertise, you help build a robust and helpful resource for PostgreSQL users worldwide.<br />
<br />
=== User Accounts ===<br />
<br />
To contribute to this wiki, you'll need a [https://www.postgresql.org/account/login/ PostgreSQL community account], which is the same account used for news and event submissions on [http://www.postgresql.org www.postgresql.org].<br />
<br />
'''All activity on this site should follow the [[Policies|PostgreSQL Project Policies]].'''<br />
<br />
'''NOTE''': due to spam activity, "editor" privileges are granted manually. If you just created a new community account, or if your current account used to have "editor" privileges, ask on either the [mailto:pgsql-www@postgresql.org pgsql-www mailing list] or the [irc://irc.libera.chat/postgresql PostgreSQL IRC Channel] (direct your request to 'pginfra'; multiple individuals in the channel highlight on that string) for help. Please include your community account name in those requests.<br />
<br />
<br />
[[Category:Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=WIP_alternative_Main_Page&diff=38429WIP alternative Main Page2023-11-28T12:13:38Z<p>C2main: Contributing and Maintaining the Wiki moved to pgwiki</p>
<hr />
<div>{{Languages}}<br />
__NOTOC__<br />
<br />
= Welcome to the PostgreSQL Wiki! =<br />
<br />
This wiki is designed to provide a comprehensive resource for end-users of PostgreSQL and the PostgreSQL community, and to discover the PostgreSQL ecosystem.<br />
<br />
== End-users ==<br />
<br />
If you're a user seeking information on using PostgreSQL or understanding its features, start here:<br />
<br />
* [[PostgreSQL_Clients|Partial list of interactive SQL clients]]<br />
* [[:Category:Extensions|PostgreSQL extension pages]]<br />
* [[Frequently Asked Questions]]<br />
* [[Community Generated Articles, Guides, and Documentation|General articles and guides]]<br />
* [[PostgreSQL Tutorials]]<br />
* [[:Category:Snippets|PostgreSQL Related Code Snippets]]<br />
* [[Detailed installation guides]]<br />
* [[Converting from other Databases to PostgreSQL]]<br />
* [[Comparison with other Database systems]]<br />
* [[PostgreSQL Related Slides and Presentations]]<br />
<br />
=== Database Administration ===<br />
<br />
If you're a PostgreSQL Database Administrator (DBA) responsible for managing and maintaining PostgreSQL instances, explore these topics:<br />
<br />
* [[Database Administration and Maintenance]]<br />
* [[Performance Optimization|PostgreSQL Performance Optimization]]<br />
* [[Backup and Recovery]]<br />
* [[High Availability and Replication]]<br />
* [[Security Best Practices]]<br />
<br />
=== Development with PostgreSQL ===<br />
<br />
For developers working on applications that interact with PostgreSQL:<br />
<br />
* [[Development Articles|Development with PostgreSQL]]<br />
* [[Best Practices for Application Development]]<br />
* [[Working with PostgreSQL Extensions]]<br />
* [[Query Optimization Techniques]]<br />
* [[:Category:SQL|SQL snippets]]<br />
<br />
=== Documentation Resources ===<br />
<br />
For official PostgreSQL documentation and additional learning resources:<br />
<br />
* [http://www.postgresql.org/docs/manuals/ PostgreSQL Manuals]<br />
* [http://www.postgresql.org/docs/books/ PostgreSQL Books]<br />
* [https://www.postgresql.org/docs/current/tutorial.html PostgreSQL Tutorials]<br />
<br />
== Community ==<br />
<br />
For those interested in contributing to PostgreSQL or staying updated on the community:<br />
<br />
You can connect with PostgreSQL users and developers on [irc://irc.libera.chat/postgresql #postgresql on Libera]. Check out the list of IRC nicknames with real-world names [[IRC2RWNames | here]].<br />
<br />
There is a root category to explore: [[:Category:Community]]<br />
<br />
== PostgreSQL ecosystem ==<br />
<br />
For discovering PostgreSQL-related software, tools, extensions, and more.<br />
<br />
There is a root category to explore: [[:Category:Ecosystem]]<br />
<br />
== Contributing and Maintaining the Wiki ==<br />
<br />
We encourage users like you to actively contribute to this PostgreSQL Wiki, making it a valuable resource for the community. Your contributions can range from adding new content to improving existing articles, ensuring that users can find the information they need easily.<br />
<br />
There is a root category to explore: [[:Category:PgWiki]]<br />
<br />
== Wiki in other languages ==<br />
<br />
If you're more comfortable with languages other than English, you can find and edit relevant content here:<br />
<br />
* [[Main Page/de|deutsch]]<br />
* [[Main Page/es|español]] <br />
* [[Main Page/fr|français]]<br />
* [[Main Page/he|עברית]]<br />
* [[Main Page/ja|日本語]]<br />
* [[Main Page/pt|português]]<br />
* [[Main Page/ru|русский]]<br />
* [[Main Page/zh|中文]]<br />
* [[Main Page/zh-hant|中文(繁體)]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=WIP_alternative_Main_Page&diff=38428WIP alternative Main Page2023-11-28T12:12:20Z<p>C2main: link to Community</p>
<hr />
<div>{{Languages}}<br />
__NOTOC__<br />
<br />
= Welcome to the PostgreSQL Wiki! =<br />
<br />
This wiki is designed to provide a comprehensive resource for end-users of PostgreSQL and the PostgreSQL community, and to discover the PostgreSQL ecosystem.<br />
<br />
== End-users ==<br />
<br />
If you're a user seeking information on using PostgreSQL or understanding its features, start here:<br />
<br />
* [[PostgreSQL_Clients|Partial list of interactive SQL clients]]<br />
* [[:Category:Extensions|PostgreSQL extension pages]]<br />
* [[Frequently Asked Questions]]<br />
* [[Community Generated Articles, Guides, and Documentation|General articles and guides]]<br />
* [[PostgreSQL Tutorials]]<br />
* [[:Category:Snippets|PostgreSQL Related Code Snippets]]<br />
* [[Detailed installation guides]]<br />
* [[Converting from other Databases to PostgreSQL]]<br />
* [[Comparison with other Database systems]]<br />
* [[PostgreSQL Related Slides and Presentations]]<br />
<br />
=== Database Administration ===<br />
<br />
If you're a PostgreSQL Database Administrator (DBA) responsible for managing and maintaining PostgreSQL instances, explore these topics:<br />
<br />
* [[Database Administration and Maintenance]]<br />
* [[Performance Optimization|PostgreSQL Performance Optimization]]<br />
* [[Backup and Recovery]]<br />
* [[High Availability and Replication]]<br />
* [[Security Best Practices]]<br />
<br />
=== Development with PostgreSQL ===<br />
<br />
For developers working on applications that interact with PostgreSQL:<br />
<br />
* [[Development Articles|Development with PostgreSQL]]<br />
* [[Best Practices for Application Development]]<br />
* [[Working with PostgreSQL Extensions]]<br />
* [[Query Optimization Techniques]]<br />
* [[:Category:SQL|SQL snippets]]<br />
<br />
=== Documentation Resources ===<br />
<br />
For official PostgreSQL documentation and additional learning resources:<br />
<br />
* [http://www.postgresql.org/docs/manuals/ PostgreSQL Manuals]<br />
* [http://www.postgresql.org/docs/books/ PostgreSQL Books]<br />
* [https://www.postgresql.org/docs/current/tutorial.html PostgreSQL Tutorials]<br />
<br />
== Community ==<br />
<br />
For those interested in contributing to PostgreSQL or staying updated on the community:<br />
<br />
You can connect with PostgreSQL users and developers on [irc://irc.libera.chat/postgresql #postgresql on Libera]. Check out the list of IRC nicknames with real-world names [[IRC2RWNames | here]].<br />
<br />
There is a root category to explore: [[:Category:Community]]<br />
<br />
== PostgreSQL ecosystem ==<br />
<br />
For discovering PostgreSQL-related software, tools, extensions, and more.<br />
<br />
There is a root category to explore: [[:Category:Ecosystem]]<br />
<br />
== Contributing and Maintaining the Wiki ==<br />
<br />
We encourage users like you to actively contribute to this PostgreSQL Wiki, making it a valuable resource for the community. Your contributions can range from adding new content to improving existing articles, ensuring that users can find the information they need easily.<br />
<br />
Here are some ways you can contribute and help maintain the wiki:<br />
<br />
# '''Adding Content''': If you have insights, tips, or tutorials related to PostgreSQL, don't hesitate to share them by creating new articles. Your expertise can benefit others in the PostgreSQL community.<br />
# '''Improving Existing Articles''': Enhance the quality of articles by correcting errors, updating outdated information, and providing clearer explanations. Every small improvement can make a big difference.<br />
# '''Defining Categories''': If you come across content that is not properly categorized, take a moment to assign appropriate categories. Well-defined categories make navigation more intuitive for users.<br />
# '''Language Contributions''': If you are fluent in a language not covered in the user documentation, consider translating important articles into that language. This will expand the reach of PostgreSQL knowledge to a wider audience.<br />
# '''Fixing Broken Links''': If you notice broken links or missing references, help keep the wiki reliable by fixing these issues. This ensures that users can access the intended information without frustration.<br />
# '''Adding Language Tags''': If you find articles without language tags for non-English content, add the relevant language tags to make it easier for users to identify the language of the article.<br />
# '''Providing Feedback''': If you have suggestions to improve the structure or organization of the wiki, share your ideas with the community. Your feedback can lead to meaningful enhancements.<br />
<br />
Remember, maintaining the wiki is a collaborative effort, and even small contributions can have a positive impact. By sharing your knowledge and expertise, you help build a robust and helpful resource for PostgreSQL users worldwide.<br />
<br />
=== User Accounts ===<br />
<br />
To contribute to this wiki, you'll need a [https://www.postgresql.org/account/login/ PostgreSQL community account], which is the same account used for news and event submissions on [http://www.postgresql.org www.postgresql.org].<br />
<br />
'''All activity on this site should follow the [[Policies|PostgreSQL Project Policies]].'''<br />
<br />
'''NOTE''': due to spam activity "editor" privileges are granted manually. If you just created a new community account or if your current account used to have "editor" privileges ask on either the [mailto:pgsql-www@postgresql.org PostgreSQL -www Mailinglist] or the [irc://irc.libera.chat/postgresql PostgreSQL IRC Channel] (direct your request to 'pginfra', multiple individuals in the channel highlight on that string) for help. Please include your community account name in those requests.<br />
<br />
== Wiki in other languages ==<br />
<br />
If you're more comfortable with languages other than English, you can find and edit relevant content here:<br />
<br />
* [[Main Page/de|deutsch]]<br />
* [[Main Page/es|español]] <br />
* [[Main Page/fr|français]]<br />
* [[Main Page/he|עברית]]<br />
* [[Main Page/ja|日本語]]<br />
* [[Main Page/pt|português]]<br />
* [[Main Page/ru|русский]]<br />
* [[Main Page/zh|中文]]<br />
* [[Main Page/zh-hant|中文(繁體)]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:PgWiki&diff=38427Category:PgWiki2023-11-28T12:04:54Z<p>C2main: typo...</p>
<hr />
<div>Placeholder for pages of interest for this wiki's editors.<br />
<br />
[[Category:Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:PgWiki&diff=38426Category:PgWiki2023-11-28T12:03:58Z<p>C2main: Set top level...</p>
<hr />
<div>Placeholder for pages of interest for this wiki's editors.<br />
<br />
[[Category::Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:PgWiki&diff=38425Category:PgWiki2023-11-28T12:03:04Z<p>C2main: PgWiki category for editors</p>
<hr />
<div>Placeholder for pages of interest for this wiki's editors.</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:PgWiki:Templates&diff=38424Category:PgWiki:Templates2023-11-28T12:02:10Z<p>C2main: Moved to PgWiki category</p>
<hr />
<div>Main category for PostgreSQL Templates.<br />
<br />
[http://www.mediawiki.org/wiki/Help:Templates Get help on MediaWiki Templates].<br />
<br />
[[Category:PgWiki]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Help&diff=38423Category:Help2023-11-28T12:01:44Z<p>C2main: Moved to PgWiki category</p>
<hr />
<div>Online Help<br />
<br />
[[Category:PgWiki]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Backup&diff=38422Category:Backup2023-11-28T11:59:00Z<p>C2main: Update content ...</p>
<hr />
<div>Backups are important. People turn up on the mailing list frequently with problems that could've been avoided or solved much more easily with proper backups.<br />
<br />
See [http://www.postgresql.org/docs/current/static/backup.html the manual] for plenty of information on backing up PostgreSQL.<br />
<br />
It's highly recommended that you keep both <code>[http://www.postgresql.org/docs/current/static/backup-dump.html pg_dump]</code> backups and <code>[https://www.postgresql.org/docs/current/app-pgbasebackup.html pg_basebackup]</code> base backups. Together they can protect you against damage caused by undetected disk or file system corruption, severe admin errors, PostgreSQL bugs (gasp!), etc.<br />
<br />
If up-to-date backups or a warm standby are important, set up [http://www.postgresql.org/docs/current/static/continuous-archiving.html continuous archiving] backups. In most cases they ensure you have backups up to the moment of a failure, rather than having to go back to last night's tapes.<br />
<br />
Several open source tools exist to manage backups; if possible, avoid (re)writing your own and avoid home-made scripts.<br />
<br />
Backups without proper recovery testing are not backups!<br />
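<br />
As a minimal, illustrative sketch of the two complementary backup styles above (the database name <code>mydb</code> and the <code>/backups</code> paths are hypothetical placeholders, not recommendations):<br />
<br />
<pre><br />
# Logical backup: a selective, portable dump of one database.<br />
pg_dump --format=custom --file=/backups/mydb.dump mydb<br />
<br />
# Physical base backup: a file-level copy of the whole cluster, usable<br />
# as the starting point for continuous archiving / point-in-time recovery.<br />
pg_basebackup --pgdata=/backups/base --format=tar --wal-method=stream<br />
<br />
# And always test recovery, e.g. restore the dump into a scratch database:<br />
createdb mydb_restore_test<br />
pg_restore --dbname=mydb_restore_test /backups/mydb.dump<br />
</pre><br />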
<br />
[[Category:Administration]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Debugging_the_PostgreSQL_grammar_(Bison)&diff=38421Debugging the PostgreSQL grammar (Bison)2023-11-28T11:46:19Z<p>C2main: Set category Development</p>
<hr />
<div>== Shift/reduce conflicts ==<br />
<br />
Postgres development rules forbid shift/reduce conflicts in the main grammar (and the other kind of conflict bison produces, reduce/reduce conflicts, is even worse). Often, when making a grammar change, you can inadvertently introduce a shift/reduce conflict that then needs to be fixed. As Tom Lane explains in the message quoted below, shift/reduce conflicts can often be fixed by "unfactoring" the grammar a little. (For more context see the original [http://archives.postgresql.org/pgsql-hackers/2009-11/msg00259.php discussion] on the -hackers mailing list.)<br />
<br />
Frequently, when debugging shift/reduce conflicts, it's useful to review the gram.output file, which building the parser leaves behind by default. This report shows how the ambiguity arises, as the worked example below demonstrates.<br />
<br />
For those interested in a bit more of the theory of parsers, most compiler construction books cover them to some degree, including the venerable if a bit dated [http://www.amazon.com/Compilers-Principles-Techniques-Alfred-Aho/dp/0201100886/ref=sr_1_1?ie=UTF8&s=books&qid=1257563487&sr=8-1 Dragon Book].<br />
<br />
<pre><br />
> > bison -v doesn't show anything useful beyond saying that there is one<br />
> > shift/reduce conflict. The gram.output is 10MB, which doesn't help me<br />
> > much (I'm still trying to make sense of it).<br />
<br />
Well, you need to learn a bit more about bison I think. The complaint<br />
is<br />
<br />
State 1135 conflicts: 1 shift/reduce<br />
<br />
so we look at state 1135, which says:<br />
<br />
state 1135<br />
<br />
241 alter_table_cmd: ADD_P . opt_column columnDef<br />
251 | ADD_P . TableConstraint<br />
<br />
CHECK shift, and go to state 1698<br />
COLUMN shift, and go to state 1742<br />
CONSTRAINT shift, and go to state 1699<br />
EXCLUSION shift, and go to state 1700<br />
FOREIGN shift, and go to state 1701<br />
PRIMARY shift, and go to state 1702<br />
UNIQUE shift, and go to state 1703<br />
<br />
EXCLUSION [reduce using rule 887 (opt_column)]<br />
$default reduce using rule 887 (opt_column)<br />
<br />
TableConstraint go to state 1743<br />
ConstraintElem go to state 1705<br />
opt_column go to state 1744<br />
<br />
This is the state immediately after scanning "ADD" in an ALTER TABLE<br />
command, and what it's unhappy about is that it has two different things<br />
to do if the next token is EXCLUSION. (The dot in the productions<br />
indicates "where we are", and the square brackets mark the unreachable<br />
action.) If you check the other mentioned states it becomes clear that<br />
the state-1700 path leads to deciding that EXCLUSION begins a<br />
TableConstraint, while rule 887 is the "empty" alternative for<br />
opt_column, and is what would have to be done next if EXCLUSION is a<br />
column name beginning a ColumnDef. So the difficulty is that it can't<br />
be sure whether EXCLUSION is a column name without looking one token<br />
past EXCLUSION, but it has to decide whether to eliminate COLUMN before<br />
it can look ahead past EXCLUSION.<br />
<br />
This is a pretty common difficulty with empty-producing productions.<br />
The usual way around it is to eliminate the empty production by making<br />
the calling production a bit more redundant. In this case, we can fix<br />
it by replacing<br />
<br />
alter_table_cmd:<br />
ADD_P opt_column columnDef<br />
<br />
with two productions<br />
<br />
alter_table_cmd:<br />
ADD_P columnDef<br />
| ADD_P COLUMN columnDef<br />
<br />
The reason this fixes it is that now the parser does not have to make<br />
a shift-reduce decision while EXCLUSION is the next token: it's just<br />
going to shift all the time, and it only has to reduce once EXCLUSION<br />
is the current token and it can see the next one as lookahead. (In<br />
which case, it will reduce EXCLUSION to ColId and proceed with the<br />
its-a-ColumnDef path, only if the next token isn't "(" or "USING".)<br />
<br />
Another way to think about it is that we are forcing bison to split<br />
this one state into two, but I find it easier to understand how to<br />
fix the problem by looking for ways to postpone the reduce decision.<br />
</pre><br />
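<br />
The pattern above can be reproduced outside the full PostgreSQL grammar. The following stripped-down, hypothetical grammar (not the real gram.y; token and rule names are simplified) triggers the same single shift/reduce conflict when run through <code>bison -v</code>: after ADD, with EXCLUSION as the lookahead, the parser cannot tell whether to shift EXCLUSION toward tableConstraint or to reduce the empty opt_column first so EXCLUSION can become a column name.<br />
<br />
<pre><br />
%token ADD COLUMN IDENT EXCLUSION<br />
<br />
%%<br />
<br />
cmd:        ADD opt_column columnDef   /* conflicts with the rule below */<br />
          | ADD tableConstraint<br />
          ;<br />
<br />
opt_column: COLUMN<br />
          | /* EMPTY */<br />
          ;<br />
<br />
columnDef:  colId ;<br />
<br />
/* EXCLUSION is an unreserved keyword, so it can also be a column name */<br />
colId:      IDENT<br />
          | EXCLUSION<br />
          ;<br />
<br />
tableConstraint: EXCLUSION '(' ')' ;<br />
</pre><br />
<br />
Applying the same unfactoring, i.e. replacing <code>ADD opt_column columnDef</code> with the two productions <code>ADD columnDef</code> and <code>ADD COLUMN columnDef</code>, removes the conflict: the parser then always shifts EXCLUSION and only has to choose between colId and tableConstraint once the next token is visible as lookahead.<br />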
<br />
== Debugging the PostgreSQL parser ==<br />
<br />
More sophisticated analysis may be required where a grammar produces unexpected results (i.e. the parser is built, but parse analysis receives a raw query tree that does not meet expectations in some way). The Bison documentation describes how to enable traces:<br />
<br />
https://www.gnu.org/software/bison/manual/html_node/Enabling-Traces.html<br />
<br />
The best way to get this to work in PostgreSQL seems to be to declare an "extern int base_yydebug" variable at the top of parser.c. It is also necessary to use the %debug directive. The following patch accomplishes this (debug output will appear on stderr):<br />
<br />
<pre><br />
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y<br />
index 0bc8815..fa21a5a 100644<br />
--- a/src/backend/parser/gram.y<br />
+++ b/src/backend/parser/gram.y<br />
@@ -635,6 +635,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);<br />
<br />
<br />
/* Precedence: lowest to highest */<br />
+%debug<br />
%nonassoc SET /* see relation_expr_opt_alias */<br />
%left UNION EXCEPT<br />
%left INTERSECT<br />
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c<br />
index 6632966..84d401c 100644<br />
--- a/src/backend/parser/parser.c<br />
+++ b/src/backend/parser/parser.c<br />
@@ -24,6 +24,7 @@<br />
#include "parser/gramparse.h"<br />
#include "parser/parser.h"<br />
<br />
+extern int base_yydebug;<br />
<br />
/*<br />
* raw_parser<br />
@@ -48,6 +49,8 @@ raw_parser(const char *str)<br />
/* initialize the bison parser */<br />
parser_init(&yyextra);<br />
<br />
+ base_yydebug = 1;<br />
+<br />
/* Parse! */<br />
yyresult = base_yyparse(yyscanner);<br />
<br />
</pre><br />
<br />
This will allow you to see exactly what states the parser enters as your statement is parsed, which, combined with a Bison report, will greatly aid debugging.<br />
<br />
== Other issues ==<br />
<br />
It may also be useful to temporarily #define COPY_PARSE_PLAN_TREES within pg_config_manual.h when at a loss as to why the parser produces unexpected results.<br />
<br />
[[Category:Development]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Ecosystem&diff=38420Category:Ecosystem2023-11-28T11:31:35Z<p>C2main: link to entry page.</p>
<hr />
<div>All about tools, software, products related to, based on or using PostgreSQL.<br />
<br />
Go to [[Ecosystem:PostgreSQL_ecosystem]] for the "entry page".<br />
<br />
[[Category:Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Ecosystem&diff=38419Category:Ecosystem2023-11-28T11:29:03Z<p>C2main: Set this category a top level one.</p>
<hr />
<div>All about tools, software, products related to, based on or using PostgreSQL.<br />
<br />
[[Category:Main]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Ecosystem&diff=38418Category:Ecosystem2023-11-28T11:28:23Z<p>C2main: This category looks important....</p>
<hr />
<div>All about tools, software, products related to, based on or using PostgreSQL.</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:Advocacy&diff=38417Category:Advocacy2023-11-28T11:18:22Z<p>C2main: Change top level category for "Community" instead of "Main" as advocacy probably belongs to community efforts.</p>
<hr />
<div>= Advocacy Category =<br />
<br />
This category gathers all pages related to advocacy efforts.<br />
<br />
You can also check out the sub-categories:<br />
<br />
[[Category:Community]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Template:The_PostgreSQL_Licence&diff=38415Template:The PostgreSQL Licence2023-11-27T13:53:17Z<p>C2main: Add a link to postgresql.org licence page</p>
<hr />
<div><noinclude><br />
<br />
== Description ==<br />
<br />
This template is used, for example, by uploaded files to indicate that they are licensed like PostgreSQL: https://www.postgresql.org/about/licence/<br />
<br />
== Usage ==<br />
<br />
<nowiki>{{</nowiki>'''The PostgreSQL Licence'''}}<br />
<br />
== Template ==<br />
[[Category:PgWiki:Templates]]<br />
</noinclude><br />
Released under the [https://www.postgresql.org/about/licence/ PostgreSQL License], a liberal Open Source license, similar to the BSD or MIT licenses.</div>C2mainhttps://wiki.postgresql.org/index.php?title=Template:The_PostgreSQL_Licence&diff=38414Template:The PostgreSQL Licence2023-11-27T13:52:22Z<p>C2main: Add missing PostgreSQL licence template</p>
<hr />
<div><noinclude><br />
<br />
== Description ==<br />
<br />
This template is used, for example, by uploaded files to indicate that they are licensed like PostgreSQL: https://www.postgresql.org/about/licence/<br />
<br />
== Usage ==<br />
<br />
<nowiki>{{</nowiki>'''The PostgreSQL Licence'''}}<br />
<br />
== Template ==<br />
[[Category:PgWiki:Templates]]<br />
</noinclude><br />
Released under the PostgreSQL License, a liberal Open Source license, similar to the BSD or MIT licenses.</div>C2mainhttps://wiki.postgresql.org/index.php?title=WIP_alternative_Main_Page&diff=38413WIP alternative Main Page2023-11-27T11:37:01Z<p>C2main: Wording</p>
<hr />
<div>{{Languages}}<br />
__NOTOC__<br />
<br />
= Welcome to the PostgreSQL Wiki! =<br />
<br />
This wiki is designed to provide a comprehensive resource for end-users of PostgreSQL and the PostgreSQL community, and to help you discover the PostgreSQL ecosystem.<br />
<br />
== End-users ==<br />
<br />
If you're a user seeking information on using PostgreSQL or understanding its features, start here:<br />
<br />
* [[PostgreSQL_Clients|Partial list of interactive SQL clients]]<br />
* [[:Category:Extensions|PostgreSQL extension pages]]<br />
* [[Frequently Asked Questions]]<br />
* [[Community Generated Articles, Guides, and Documentation|General articles and guides]]<br />
* [[PostgreSQL Tutorials]]<br />
* [[:Category:Snippets|PostgreSQL Related Code Snippets]]<br />
* [[Detailed installation guides]]<br />
* [[Converting from other Databases to PostgreSQL]]<br />
* [[Comparison with other Database systems]]<br />
* [[PostgreSQL Related Slides and Presentations]]<br />
<br />
=== Database Administration ===<br />
<br />
If you're a PostgreSQL Database Administrator (DBA) responsible for managing and maintaining PostgreSQL instances, explore these topics:<br />
<br />
* [[Database Administration and Maintenance]]<br />
* [[Performance Optimization|PostgreSQL Performance Optimization]]<br />
* [[Backup and Recovery]]<br />
* [[High Availability and Replication]]<br />
* [[Security Best Practices]]<br />
<br />
=== Development with PostgreSQL ===<br />
<br />
For developers working on applications that interact with PostgreSQL:<br />
<br />
* [[Development Articles|Development with PostgreSQL]]<br />
* [[Best Practices for Application Development]]<br />
* [[Working with PostgreSQL Extensions]]<br />
* [[Query Optimization Techniques]]<br />
* [[:Category:SQL|SQL snippets]]<br />
<br />
=== Documentation Resources ===<br />
<br />
For official PostgreSQL documentation and additional learning resources:<br />
<br />
* [http://www.postgresql.org/docs/manuals/ PostgreSQL Manuals]<br />
* [http://www.postgresql.org/docs/books/ PostgreSQL Books]<br />
* [https://www.postgresql.org/docs/current/tutorial.html PostgreSQL Tutorials]<br />
<br />
== Community ==<br />
<br />
For those interested in contributing to PostgreSQL or staying updated on the community:<br />
<br />
* You can connect with PostgreSQL users and developers on [irc://irc.libera.chat/postgresql #postgresql on Libera]. Check out the list of IRC nicknames with real-world names [[IRC2RWNames | here]].<br />
* [[Events | Upcoming Events]]<br />
* [[Development information]]<br />
* [[Advocacy]]<br />
<br />
== PostgreSQL ecosystem ==<br />
<br />
For discovering PostgreSQL-related software, tools, extensions, and more:<br />
<br />
There is a root category to explore: [[:Category:Ecosystem]]<br />
<br />
== Contributing and Maintaining the Wiki ==<br />
<br />
We encourage users like you to actively contribute to this PostgreSQL Wiki, making it a valuable resource for the community. Your contributions can range from adding new content to improving existing articles, ensuring that users can find the information they need easily.<br />
<br />
Here are some ways you can contribute and help maintain the wiki:<br />
<br />
# '''Adding Content''': If you have insights, tips, or tutorials related to PostgreSQL, don't hesitate to share them by creating new articles. Your expertise can benefit others in the PostgreSQL community.<br />
# '''Improving Existing Articles''': Enhance the quality of articles by correcting errors, updating outdated information, and providing clearer explanations. Every small improvement can make a big difference.<br />
# '''Defining Categories''': If you come across content that is not properly categorized, take a moment to assign appropriate categories. Well-defined categories make navigation more intuitive for users.<br />
# '''Language Contributions''': If you are fluent in a language not covered in the user documentation, consider translating important articles into that language. This will expand the reach of PostgreSQL knowledge to a wider audience.<br />
# '''Fixing Broken Links''': If you notice broken links or missing references, help keep the wiki reliable by fixing these issues. This ensures that users can access the intended information without frustration.<br />
# '''Adding Language Tags''': If you find articles without language tags for non-English content, add the relevant language tags to make it easier for users to identify the language of the article.<br />
# '''Providing Feedback''': If you have suggestions to improve the structure or organization of the wiki, share your ideas with the community. Your feedback can lead to meaningful enhancements.<br />
<br />
Remember, maintaining the wiki is a collaborative effort, and even small contributions can have a positive impact. By sharing your knowledge and expertise, you help build a robust and helpful resource for PostgreSQL users worldwide.<br />
<br />
=== User Accounts ===<br />
<br />
To contribute to this wiki, you'll need a [https://www.postgresql.org/account/login/ PostgreSQL community account], which is the same account used for news and event submissions on [http://www.postgresql.org www.postgresql.org].<br />
<br />
'''All activity on this site should follow the [[Policies|PostgreSQL Project Policies]].'''<br />
<br />
'''NOTE''': due to spam activity "editor" privileges are granted manually. If you just created a new community account or if your current account used to have "editor" privileges ask on either the [mailto:pgsql-www@postgresql.org PostgreSQL -www Mailinglist] or the [irc://irc.libera.chat/postgresql PostgreSQL IRC Channel] (direct your request to 'pginfra', multiple individuals in the channel highlight on that string) for help. Please include your community account name in those requests.<br />
<br />
== Wiki in other languages ==<br />
<br />
If you're more comfortable with languages other than English, you can find and edit relevant content here:<br />
<br />
* [[Main Page/de|deutsch]]<br />
* [[Main Page/es|español]] <br />
* [[Main Page/fr|français]]<br />
* [[Main Page/he|עברית]]<br />
* [[Main Page/ja|日本語]]<br />
* [[Main Page/pt|português]]<br />
* [[Main Page/ru|русский]]<br />
* [[Main Page/zh|中文]]<br />
* [[Main Page/zh-hant|中文(繁體)]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=Category:SQL&diff=38412Category:SQL2023-11-27T11:28:43Z<p>C2main: Set snippets category</p>
<hr />
<div>This category contains code snippets implemented in the SQL language.<br />
<br />
== See also ==<br />
[[Snippets]]<br />
<br />
[[Category:Languages]]<br />
[[Category:Snippets]]</div>C2mainhttps://wiki.postgresql.org/index.php?title=WIP_alternative_Main_Page&diff=38411WIP alternative Main Page2023-11-27T11:26:25Z<p>C2main: Add tutorials</p>
<hr />
<div>__NOTOC__<br />
{{Languages}}<br />
= Welcome to the PostgreSQL Wiki! =<br />
<br />
This wiki is designed to provide a comprehensive resource for end-users of PostgreSQL, the PostgreSQL community, and the PostgreSQL ecosystem.<br />
<br />
== End-users ==<br />
<br />
If you're a user seeking information on using PostgreSQL or understanding its features, start here:<br />
<br />
* [[PostgreSQL_Clients|Partial list of interactive SQL clients]]<br />
* [[:Category:Extensions|PostgreSQL extension pages]]<br />
* [[Frequently Asked Questions]]<br />
* [[Community Generated Articles, Guides, and Documentation|General articles and guides]]<br />
* [[PostgreSQL Tutorials]]<br />
* [[:Category:Snippets|PostgreSQL Related Code Snippets]]<br />
* [[Detailed installation guides]]<br />
* [[Converting from other Databases to PostgreSQL]]<br />
* [[Comparison with other Database systems]]<br />
* [[PostgreSQL Related Slides and Presentations]]<br />
<br />
=== Database Administration ===<br />
<br />
If you're a PostgreSQL Database Administrator (DBA) responsible for managing and maintaining PostgreSQL instances, explore these topics:<br />
<br />
* [[Database Administration and Maintenance]]<br />
* [[Performance Optimization|PostgreSQL Performance Optimization]]<br />
* [[Backup and Recovery]]<br />
* [[High Availability and Replication]]<br />
* [[Security Best Practices]]<br />
<br />
=== Development with PostgreSQL ===<br />
<br />
For developers working on applications that interact with PostgreSQL:<br />
<br />
* [[Development Articles|Development with PostgreSQL]]<br />
* [[Best Practices for Application Development]]<br />
* [[Working with PostgreSQL Extensions]]<br />
* [[Query Optimization Techniques]]<br />
* [[:Category:SQL|SQL snippets]]<br />
<br />
=== Documentation Resources ===<br />
<br />
For official PostgreSQL documentation and additional learning resources:<br />
<br />
* [http://www.postgresql.org/docs/manuals/ PostgreSQL Manuals]<br />
* [http://www.postgresql.org/docs/books/ PostgreSQL Books]<br />
* [https://www.postgresql.org/docs/current/tutorial.html PostgreSQL Tutorials]<br />
<br />
== Community ==<br />
<br />
For those interested in contributing to PostgreSQL or staying updated on the community:<br />
<br />
* You can connect with PostgreSQL users and developers on [irc://irc.libera.chat/postgresql #postgresql on Libera]. Check out the list of IRC nicknames with real-world names [[IRC2RWNames | here]].<br />
* [[Events | Upcoming Events]]<br />
* [[Development information]]<br />
* [[Advocacy]]<br />
<br />
== PostgreSQL ecosystem ==<br />
<br />
For discovering PostgreSQL-related software, tools, extensions, and more:<br />
<br />
There is a root category to explore: [[:Category:Ecosystem]]<br />
<br />
== Contributing and Maintaining the Wiki ==<br />
<br />
We encourage users like you to actively contribute to this PostgreSQL Wiki, making it a valuable resource for the community. Your contributions can range from adding new content to improving existing articles, ensuring that users can find the information they need easily.<br />
<br />
Here are some ways you can contribute and help maintain the wiki:<br />
<br />
# '''Adding Content''': If you have insights, tips, or tutorials related to PostgreSQL, don't hesitate to share them by creating new articles. Your expertise can benefit others in the PostgreSQL community.<br />
# '''Improving Existing Articles''': Enhance the quality of articles by correcting errors, updating outdated information, and providing clearer explanations. Every small improvement can make a big difference.<br />
# '''Defining Categories''': If you come across content that is not properly categorized, take a moment to assign appropriate categories. Well-defined categories make navigation more intuitive for users.<br />
# '''Language Contributions''': If you are fluent in a language not covered in the user documentation, consider translating important articles into that language. This will expand the reach of PostgreSQL knowledge to a wider audience.<br />
# '''Fixing Broken Links''': If you notice broken links or missing references, help keep the wiki reliable by fixing these issues. This ensures that users can access the intended information without frustration.<br />
# '''Adding Language Tags''': If you find articles without language tags for non-English content, add the relevant language tags to make it easier for users to identify the language of the article.<br />
# '''Providing Feedback''': If you have suggestions to improve the structure or organization of the wiki, share your ideas with the community. Your feedback can lead to meaningful enhancements.<br />
<br />
Remember, maintaining the wiki is a collaborative effort, and even small contributions can have a positive impact. By sharing your knowledge and expertise, you help build a robust and helpful resource for PostgreSQL users worldwide.<br />
<br />
=== User Accounts ===<br />
<br />
To contribute to this wiki, you'll need a [https://www.postgresql.org/account/login/ PostgreSQL community account], which is the same account used for news and event submissions on [http://www.postgresql.org www.postgresql.org].<br />
<br />
'''All activity on this site should follow the [[Policies|PostgreSQL Project Policies]].'''<br />
<br />
'''NOTE''': due to spam activity "editor" privileges are granted manually. If you just created a new community account or if your current account used to have "editor" privileges ask on either the [mailto:pgsql-www@postgresql.org PostgreSQL -www Mailinglist] or the [irc://irc.libera.chat/postgresql PostgreSQL IRC Channel] (direct your request to 'pginfra', multiple individuals in the channel highlight on that string) for help. Please include your community account name in those requests.<br />
<br />
== Wiki in other languages ==<br />
<br />
If you're more comfortable with languages other than English, you can find and edit relevant content here:<br />
<br />
* [[Main Page/de|deutsch]]<br />
* [[Main Page/es|español]] <br />
* [[Main Page/fr|français]]<br />
* [[Main Page/he|עברית]]<br />
* [[Main Page/ja|日本語]]<br />
* [[Main Page/pt|português]]<br />
* [[Main Page/ru|русский]]<br />
* [[Main Page/zh|中文]]<br />
* [[Main Page/zh-hant|中文(繁體)]]</div>C2main