= Tuning Your PostgreSQL Server =<br />
<br />
__NOTOC__<br />
''by Greg Smith, Robert Treat, and Christopher Browne''<br />
<br />
{{Languages}}<br />
<br />
PostgreSQL ships with a basic configuration tuned for wide compatibility rather than performance. Odds are good the default parameters are very undersized for your system. Rather than get dragged into the details of everything you should eventually know (which you can find if you want it at the [http://www.pgcon.org/2008/schedule/events/104.en.html GUC Three Hour Tour]), here we're going to sprint through a simplified view of the basics, with a look at the most common things people new to PostgreSQL aren't aware of. You should click on the name of the parameter in each section to jump to the relevant documentation in the PostgreSQL manual for more details after reading the quick intro here. There is also additional information available about many of these parameters, as well as a list of parameters you shouldn't adjust, at [https://www.packtpub.com/article/server-configuration-tuning-postgresql Server Configuration Tuning].<br />
<br />
== Background Information on Configuration Settings ==<br />
<br />
PostgreSQL settings can be manipulated a number of different ways, but generally you will want to update them in your postgresql.conf file. The specific options available change from release to release; the definitive list is in the source code at src/backend/utils/misc/guc.c for your version of PostgreSQL (but the pg_settings view works well enough for most purposes).<br />
<br />
=== The types of settings ===<br />
<br />
There are several different types of configuration settings, divided up based on the possible inputs they take:<br />
<br />
* Boolean: true, false, on, off<br />
* Integer: Whole numbers (2112)<br />
* Float: Decimal values (21.12)<br />
* Memory / Disk: Integers (2112) or "computer units" (512MB, 2112GB). Avoid integers--you need to know the underlying unit to figure out what they mean.<br />
* Time: "Time units" aka d,m,s (30s). Sometimes the unit is left out; don't do that<br />
* Strings: Single quoted text ('pg_log')<br />
* ENUMs: Strings, but from a specific list ('WARNING', 'ERROR')<br />
* Lists: A comma separated list of strings ('"$user",public,tsearch2')<br />
<br />
=== When they take effect ===<br />
<br />
PostgreSQL settings have different levels of flexibility for when they can be changed, usually related to internal code restrictions. The complete list of levels is:<br />
<br />
* Postmaster: requires restart of server <br />
* Sighup: requires a HUP of the server, either by kill -HUP (usually -1), pg_ctl reload, or select pg_reload_conf();<br />
* User: can be set within individual sessions, take effect only within that session<br />
* Internal: set at compile time, can't be changed, mainly for reference<br />
* Backend: settings which must be set before session start<br />
* Superuser: can be set at runtime for the server by superusers<br />
<br />
Most of the time you'll only use the first of these, but the second can be useful if you have a server you don't want to take down, while the user session settings can be helpful for some special situations. You can tell which type of parameter a setting is by looking at the "context" field in the pg_settings view.<br />
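<br />
For example, to check when a change to a given parameter will take effect (shared_buffers here is just an illustration; any parameter name works):<br />
<br />
<code><pre><br />
SELECT name, context FROM pg_settings WHERE name = 'shared_buffers';<br />
<br />
      name      |  context<br />
----------------+------------<br />
 shared_buffers | postmaster<br />
</pre></code><br />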
<br />
=== Important notes about postgresql.conf ===<br />
<br />
* You should be able to find it at $PGDATA/postgresql.conf; watch out for symbolic links and other trickiness<br />
* You can figure out the file location with ''SHOW config_file''<br />
* Lines with # are comments and have no effect. For a new database, this will mean the setting is using the default, but on running systems this may not hold true! Changes to the postgresql.conf do not take effect without a reload/restart, so it's possible for the system to be running something different from what is in the file. <br />
* If the same setting is listed multiple times, the last one wins.<br />
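<br />
For instance, in this postgresql.conf fragment (values purely illustrative), only the final line has any effect:<br />
<br />
<code><pre><br />
shared_buffers = 128MB<br />
# ... further down the file ...<br />
shared_buffers = 2GB    # this is the value the server actually uses<br />
</pre></code><br />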
<br />
=== Viewing the current settings === <br />
<br />
* Look in postgresql.conf. This works if you follow good practice, but it's not definitive!<br />
* ''show all'', ''show <setting>'' will show you the current value of the setting. Watch out for session specific changes<br />
* ''select * from pg_settings'' will label session specific changes as locally modified<br />
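<br />
For example, a quick way to list everything that has been changed away from its default (the source column reports where each value came from):<br />
<br />
<code><pre><br />
SELECT name, setting, source FROM pg_settings WHERE source <> 'default';<br />
</pre></code><br />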
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-connection.html#GUC-LISTEN-ADDRESSES listen_addresses] ==<br />
<br />
By default, PostgreSQL only responds to connections from the local host. If you want your server to be accessible from other systems via standard TCP/IP networking, you need to change listen_addresses from its default. The usual approach is to set it to listen to all addresses like this:<br />
<br />
<code><pre><br />
listen_addresses = '*'<br />
</pre></code><br />
<br />
And then control who can and cannot connect via the [http://www.postgresql.org/docs/current/static/auth-pg-hba-conf.html pg_hba.conf] file.<br />
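<br />
As an illustration only--the network, databases, users, and authentication method all depend on your environment--a pg_hba.conf entry allowing password-authenticated connections from a local subnet might look like:<br />
<br />
<code><pre><br />
# TYPE  DATABASE        USER            ADDRESS                 METHOD<br />
host    all             all             192.168.1.0/24          md5<br />
</pre></code><br />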
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-connection.html#GUC-MAX-CONNECTIONS max_connections]==<br />
<br />
max_connections sets exactly that: the maximum number of client connections allowed. This is very important to some of the below parameters (particularly work_mem) because there are some memory resources that are or can be allocated on a per-client basis, so the maximum number of clients suggests the maximum possible memory use. Generally, PostgreSQL on good hardware can support a few hundred connections. If you want to have thousands instead, you should consider using [[Replication, Clustering, and Connection Pooling|connection pooling software]] to reduce the connection overhead.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-SHARED-BUFFERS shared_buffers]==<br />
<br />
The shared_buffers configuration parameter determines how much memory is dedicated to PostgreSQL to use for caching data. One reason the defaults are low is because on some platforms (like older Solaris versions and SGI), having large values requires invasive action like recompiling the kernel. Even on a modern Linux system, the stock kernel will likely not allow setting shared_buffers to over 32MB without adjusting kernel settings first. (PostgreSQL 9.3 and later use a different shared memory mechanism, so kernel settings will usually not have to be adjusted there.)<br />
<br />
If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers is 1/4 of the memory in your system. If you have less RAM you'll have to account more carefully for how much RAM the OS is taking up; closer to 15% is more typical there. There are some workloads where even larger settings for shared_buffers are effective, but given the way PostgreSQL also relies on the operating system cache, it's unlikely you'll find using more than 40% of RAM to work better than a smaller amount.<br />
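<br />
As a hedged starting point rather than a rule, on a dedicated server with 8GB of RAM the 1/4 guideline works out to something like:<br />
<br />
<code><pre><br />
shared_buffers = 2GB<br />
</pre></code><br />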
<br />
Be aware that if your system or PostgreSQL build is 32-bit, it might not be practical to set shared_buffers above 2 to 2.5GB. See [http://rhaas.blogspot.jp/2011/05/sharedbuffers-on-32-bit-systems.html this blog post] for details.<br />
<br />
Note that on Windows, large values for shared_buffers aren't as effective, and you may find better results keeping it relatively low and using the OS cache more instead. On Windows the useful range is 64MB to 512MB.<br />
<br />
If you are running PostgreSQL 9.2 or earlier, it's likely you will have to increase the amount of memory your operating system allows you to allocate at once to set the value for shared_buffers this high. On UNIX-like systems, if you set it above what's supported, you'll get a message like this:<br />
<br />
<code><pre><br />
IpcMemoryCreate: shmget(key=5432001, size=415776768, 03600) failed: Invalid argument <br />
<br />
This error usually means that PostgreSQL's request for a shared memory <br />
segment exceeded your kernel's SHMMAX parameter. You can either <br />
reduce the request size or reconfigure the kernel with larger SHMMAX. <br />
To reduce the request size (currently 415776768 bytes), reduce <br />
PostgreSQL's shared_buffers parameter (currently 50000) and/or <br />
its max_connections parameter (currently 12).<br />
</pre></code><br />
<br />
See [http://www.postgresql.org/docs/current/static/kernel-resources.html Managing Kernel Resources] for details on how to correct this.<br />
<br />
Changing this setting requires restarting the database. Also, this is a hard allocation of memory; the whole thing gets allocated out of virtual memory when the database starts.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE effective_cache_size]==<br />
<br />
effective_cache_size should be set to an estimate of how much memory is available for disk caching by the operating system and within the database itself, after taking into account what's used by the OS itself and other applications. This is a guideline for how much memory you expect to be available in the OS and PostgreSQL buffer caches, not an allocation! This value is used only by the PostgreSQL query planner to figure out whether plans it's considering would be expected to fit in RAM or not. If it's set too low, indexes may not be used for executing queries the way you'd expect. The setting for shared_buffers is not taken into account here--only the effective_cache_size value is, so it should include memory dedicated to the database too.<br />
<br />
Setting effective_cache_size to 1/2 of total memory would be a normal conservative setting, and 3/4 of memory is a more aggressive but still reasonable amount. You might find a better estimate by looking at your operating system's statistics. On UNIX-like systems, add the free+cached numbers from free or top to get an estimate. On Windows see the "System Cache" size in the Windows Task Manager's Performance tab. Changing this setting does not require restarting the database (HUP is enough).<br />
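<br />
For example, on a dedicated 16GB server the guidelines above suggest somewhere between 8GB and 12GB; the more aggressive end of that range would be:<br />
<br />
<code><pre><br />
effective_cache_size = 12GB<br />
</pre></code><br />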
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-CHECKPOINTS checkpoint_segments checkpoint_completion_target]==<br />
<br />
PostgreSQL writes new transactions to the database in files called WAL segments that are 16MB in size. Every time checkpoint_segments worth of these files have been written, by default 3, a checkpoint occurs. Checkpoints can be resource intensive, and on a modern system doing one every 48MB will be a serious performance bottleneck. Setting checkpoint_segments to a much larger value improves that. Unless you're running on a very small configuration, you'll almost certainly be better setting this to at least 10, which also allows usefully increasing the completion target.<br />
<br />
For more write-heavy systems, values from 32 (checkpoint every 512MB) to 256 (every 4GB) are popular nowadays. Very large settings use a lot more disk and will cause your database to take longer to recover, so make sure you're comfortable with both those things before large increases. Normally the large settings (>64/1GB) are only used for bulk loading. Note that whatever you choose for the segments, you'll still get a checkpoint at least every 5 minutes unless you also increase checkpoint_timeout (which isn't necessary on most systems).<br />
<br />
Checkpoint writes are spread out a bit while the system starts working toward the next checkpoint. You can spread those writes out further, lowering the average write overhead, by increasing the checkpoint_completion_target parameter to its useful maximum of 0.9 (aim to finish by the time 90% of the next checkpoint is here) rather than the default of 0.5 (aim to finish when the next one is 50% done). A setting of 0 gives something similar to the behavior of obsolete versions. The main reason the default isn't just 0.9 is that you need a larger checkpoint_segments value than the default for broader spreading to work well. For lots more information on checkpoint tuning, see [http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm Checkpoints and the Background Writer] (where you'll also learn why tuning the background writer parameters is challenging to do usefully).<br />
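<br />
Putting these together, a sketch of a configuration for a moderately write-heavy server (adjust to your own disk-space and recovery-time tolerances) might be:<br />
<br />
<code><pre><br />
checkpoint_segments = 32              # checkpoint roughly every 512MB of WAL<br />
checkpoint_completion_target = 0.9<br />
</pre></code><br />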
<br />
==[http://www.postgresql.org/docs/current/static/routine-vacuuming.html#AUTOVACUUM autovacuum]==<br />
<br />
The autovacuum process takes care of several maintenance chores inside your database that you really need. Generally, if you think you need to turn regular vacuuming off because it's taking too much time or resources, that means you're doing it wrong. The answer to almost all vacuuming problems is to vacuum more often, not less, so that each individual vacuum operation has less to clean up.<br />
<br />
However, it's acceptable to disable autovacuum for short periods of time, for instance when bulk loading large amounts of data.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-logging.html logging]==<br />
There are many things you can log that may or may not be important to you. You should investigate the documentation on all of the options, but here are some tips & tricks to get you started:<br />
<br />
*pgFouine is a tool used to analyze PostgreSQL logs for performance tuning. If you plan to use this tool, it has specific logging requirements; please check http://pgfouine.projects.postgresql.org/ Note that pgFouine has since been obsoleted by [http://dalibo.github.com/pgbadger PgBadger].<br />
<br />
*[https://github.com/darold/pgcluu PgCluu] is a handy PostgreSQL performance monitoring and auditing tool from the author of PgBadger.<br />
<br />
*log_destination & log_directory (& log_filename): What you set these options to is not as important as knowing they can give you hints to determine where your database server is logging to. Best practice would be to try and make this as similar as possible across your servers. Note that in some cases, the init script starting your database may be customizing the log destination in the command line used to start the database, overriding what's in the postgresql.conf (and making it so you'll get different behavior if you run pg_ctl manually instead of using the init script).<br />
<br />
*log_min_error_statement: You should probably make sure this is at least set to error, so that you will see any SQL commands which cause an error. This should be the default on recent versions. <br />
<br />
*log_min_duration_statement: Not necessary for everyday use, but this can generate [[Logging Difficult Queries|logs of "slow queries"]] on your system. <br />
<br />
*log_line_prefix: Prepends information to the start of each log line. A good generic recommendation is '%t:%r:%u@%d:[%p]: ' : %t=timestamp, %r=host connecting from, %u=db user name, %d=database connecting to, %p=PID of connection. It may not be obvious what the PID is useful for at first, but it can be vital for troubleshooting problems later, so it's better to put it in the logs from the start.<br />
<br />
*log_statement: Choices of none, ddl, mod, all. Using all in production leads to severe performance penalties. DDL can sometimes be helpful for discovering rogue changes made outside of your recommended processes, by "cowboy DBAs" for example.<br />
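<br />
Pulling several of these together, an illustrative (not prescriptive) logging setup might look like this; the 250ms threshold in particular is just an example value:<br />
<br />
<code><pre><br />
log_line_prefix = '%t:%r:%u@%d:[%p]: '<br />
log_min_error_statement = error<br />
log_min_duration_statement = 250    # milliseconds; log anything slower than this<br />
log_statement = 'ddl'<br />
</pre></code><br />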
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET default_statistics_target]==<br />
<br />
The database software collects statistics about each of the tables in your database to decide how to execute queries against it. If you're not getting good query execution plans, particularly on larger (or more varied) tables, you should increase default_statistics_target then ANALYZE the database again (or wait for autovacuum to do it for you). <br />
<br />
;PostgreSQL 8.4 and later<br />
<br />
The starting default_statistics_target value was raised from 10 to 100 in PostgreSQL 8.4. Increases beyond 100 may still be useful, but this increase makes for greatly improved statistics estimation in the default configuration. The maximum value for the parameter was also increased from 1000 to 10,000 in 8.4.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-WORK-MEM work_mem]==<br />
<br />
If you do a lot of complex sorts, and have a lot of memory, then increasing the <code>work_mem</code> parameter allows PostgreSQL to do larger in-memory sorts which, unsurprisingly, will be faster than disk-based equivalents.<br />
<br />
This size is applied to each and every sort done by each user, and complex queries can use multiple working memory sort buffers. Set it to 50MB, and have 30 users submitting queries, and you are soon using 1.5GB of real memory. Furthermore, if a query involves doing merge sorts of 8 tables, that requires 8 times work_mem. You need to consider what you set max_connections to in order to size this parameter correctly. This is a setting where data warehouse systems, where users are submitting very large queries, can readily make use of many gigabytes of memory.<br />
<br />
[http://www.postgresql.org/docs/9.3/static/runtime-config-logging.html#GUC-LOG-TEMP-FILES log_temp_files] can be used to log sorts, hashes, and temp files which can be useful in figuring out if sorts are spilling to disk instead of fitting in memory. You can see sorts spilling to disk using <code>EXPLAIN ANALYZE</code> plans as well. For example, if you see a line like <code>Sort Method: external merge Disk: 7526kB</code> in the output of EXPLAIN ANALYZE, a <code>work_mem</code> of at least 8MB would keep the intermediate data in memory and likely improve the query response time.<br />
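<br />
Since work_mem can be changed per session, you can safely experiment on one problem query before touching the server-wide default; the table and value here are hypothetical:<br />
<br />
<code><pre><br />
SET work_mem = '16MB';   -- affects this session only<br />
EXPLAIN ANALYZE SELECT * FROM orders ORDER BY total_amount;<br />
</pre></code><br />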
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM maintenance_work_mem]==<br />
<br />
Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 16 megabytes (16MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.<br />
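<br />
Because of that, rather than raising the server-wide value, a common approach is to raise it only in the session doing the heavy work; the size, table, and index here are purely illustrative:<br />
<br />
<code><pre><br />
SET maintenance_work_mem = '512MB';<br />
CREATE INDEX orders_total_idx ON orders (total_amount);<br />
</pre></code><br />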
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-WAL-SYNC-METHOD wal_sync_method wal_buffers]==<br />
<br />
After every transaction, PostgreSQL forces a commit to disk out to its write-ahead log. This can be done a couple of ways, and on some platforms the other options are considerably faster than the conservative default. open_sync is the most common non-default setting switched to, on platforms that support it but default to one of the fsync methods. See [http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm Tuning PostgreSQL WAL Synchronization] for a lot of background on this topic. Note that open_sync writing is buggy on some platforms (such as [http://lwn.net/Articles/350219/ Linux]), and you should (as always) do plenty of tests under a heavy write load to make sure that you haven't made your system less stable with this change. [[Reliable Writes]] contains more information on this topic.<br />
<br />
Linux kernels starting with version 2.6.33 will cause earlier versions of PostgreSQL to default to wal_sync_method=open_datasync; before that kernel release the default picked was always fdatasync. This can cause a significant performance decrease when combined with small writes and/or small values for wal_buffers.<br />
<br />
Increasing wal_buffers from its tiny default of a small number of kilobytes is helpful for write-heavy systems. Benchmarking generally suggests that just increasing to 1MB is enough for some large systems, and given the amount of RAM in modern servers allocating a full WAL segment (16MB, the useful upper-limit here) is reasonable. Changing wal_buffers requires a database restart.<br />
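<br />
On releases before the automatic sizing described below, an explicit setting for a busy server might simply be:<br />
<br />
<code><pre><br />
wal_buffers = 16MB<br />
</pre></code><br />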
<br />
;PostgreSQL 9.1 and later<br />
<br />
Starting with PostgreSQL 9.1 wal_buffers defaults to being 1/32 of the size of shared_buffers, with an upper limit of 16MB (reached when shared_buffers=512MB).<br />
<br />
PostgreSQL 9.1 also changes the logic for selecting the default wal_sync_method such that on newer Linux kernels, it will still select fdatasync as its method--the same as on older Linux versions.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-CONSTRAINT-EXCLUSION constraint_exclusion]==<br />
<br />
<tt>constraint_exclusion</tt> defaults (as of PostgreSQL 8.4) to a newer choice: <tt>partition</tt>. This only enables constraint exclusion for partitioned tables, which is the right thing to do in nearly all cases.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-resource.html#GUC-MAX-PREPARED-TRANSACTIONS max_prepared_transactions]==<br />
<br />
This setting is used for managing two-phase commit. If you do not use two-phase commit (and if you don't know what it is, you don't use it), then you can set this value to 0. That will save a little bit of shared memory. For database systems with a large number (at least hundreds) of concurrent connections, be aware that this setting also affects the number of available lock-slots in pg_locks, so you may want to leave it at the default setting. There is a formula for how much memory gets allocated [http://www.postgresql.org/docs/current/static/kernel-resources.html#SHARED-MEMORY-PARAMETERS in the docs] and in the default postgresql.conf.<br />
<br />
Changing max_prepared_transactions requires a server restart.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT synchronous_commit]==<br />
PostgreSQL can only safely use a write cache if it has a battery backup. See [http://www.postgresql.org/docs/current/static/wal-reliability.html WAL reliability] for an essential introduction to this topic. No, really; go read that right now, it's vital to understand that if you want your database to work right.<br />
<br />
You may be limited to approximately 100 transaction commits per second per client in situations where you don't have such a durable write cache (and perhaps only 500/second even with lots of clients).<br />
<br />
For situations where a small amount of data loss is acceptable in return for a large boost in how many updates you can do to the database per second, consider switching synchronous commit off. This is particularly useful in the situation where you do not have a battery-backed write cache on your disk controller, because you could potentially get thousands of commits per second instead of just a few hundred.<br />
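<br />
Because synchronous_commit can be changed per session, or per transaction with SET LOCAL, you can disable it for just the work that can tolerate losing its most recent commits; a minimal sketch (the UPDATE is a hypothetical low-value write):<br />
<br />
<code><pre><br />
SET synchronous_commit = off;         -- everything in this session<br />
<br />
BEGIN;<br />
SET LOCAL synchronous_commit = off;   -- or just this transaction<br />
UPDATE page_hits SET hits = hits + 1 WHERE page_id = 42;<br />
COMMIT;<br />
</pre></code><br />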
<br />
For obsolete versions of PostgreSQL, you may find people recommending that you set ''fsync=off'' to speed up writes on busy systems. This is dangerous--a power loss could result in your database getting corrupted and being unable to start again. Turning synchronous commit off doesn't introduce that risk of ''corruption'', which is really bad; it only introduces some risk of data ''loss''.<br />
<br />
==[http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-RANDOM-PAGE-COST random_page_cost]==<br />
This setting suggests to the optimizer how long it will take your disks to seek to a random disk page, as a multiple of how long a sequential read (with a cost of 1.0) takes. If you have particularly fast disks, as commonly found with RAID arrays of SCSI disks, it may be appropriate to lower random_page_cost, which will encourage the query optimizer to use random access index scans. Some feel that 4.0 is always too large on current hardware; it's not unusual for administrators to standardize on always setting this between 2.0 and 3.0 instead. In some cases that behavior is a holdover from earlier PostgreSQL versions where having random_page_cost too high was more likely to screw up plan optimization than it is now (and setting it at or below 2.0 was regularly necessary). Since these cost estimates are just that--estimates--it shouldn't hurt to try lower values.<br />
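<br />
As with the memory parameters, you can try a lower value in a single session and compare plans before changing the server default; the query here is hypothetical:<br />
<br />
<code><pre><br />
SET random_page_cost = 2.0;   -- this session only<br />
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;<br />
</pre></code><br />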
<br />
But this is not where you should start your search for plan problems. Note that random_page_cost is pretty far down this list (at the end in fact). If you are getting bad plans, this shouldn't be the first thing you look at, even though lowering this value may be effective. Instead, you should start by making sure autovacuum is working properly, that you are collecting enough statistics, and that you have correctly sized the memory parameters for your server--all the things gone over above. After you've done all those much more important things, if you're still getting bad plans ''then'' you should see if lowering random_page_cost is still useful.<br />
<br />
[[Category:Administration]] [[Category:Performance]]
<br />
<hr />
<br />
{{SnippetInfo|Unnest a multidimensional array|version=9.1+|lang=SQL|category=Library}}<br />
<br />
Purpose: To emulate the [http://www.postgresql.org/docs/current/static/functions-array.html unnest] functionality for a multi-dimensional array.<br />
Courtesy of Pavel Stehule, from http://markmail.org/message/pe5t3l6kaa64llfg<br />
<br />
== Function ==<br />
<br />
<source lang="sql"><br />
CREATE OR REPLACE FUNCTION public.reduce_dim(anyarray)<br />
RETURNS SETOF anyarray AS<br />
$function$<br />
DECLARE<br />
s $1%type;<br />
BEGIN<br />
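  -- SLICE 1 makes FOREACH step through the array's first dimension,<br />
  -- so each s is a one-dimensional sub-array of the input<br />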
FOREACH s SLICE 1 IN ARRAY $1 LOOP<br />
RETURN NEXT s;<br />
END LOOP;<br />
RETURN;<br />
END;<br />
$function$<br />
LANGUAGE plpgsql IMMUTABLE;<br />
</source><br />
<br />
<br />
<br />
<source lang="sql"><br />
select reduce_dim(array[array[1, 2], array[2, 3], array[4,5], array[9,10]]);<br />
reduce_dim<br />
------------<br />
{1,2}<br />
{2,3}<br />
{4,5}<br />
{9,10}<br />
</source><br />
<br />
<br />
[[Category:SQL]]
<br />
<hr />
<br />
== Extract Days or Interval from a Range type value ==<br />
<br />
{{SnippetInfo|Extract days from Range type|lang=SQL|version=9.2+|category=Library}}<br />
<br />
To determine the interval between the lower and upper bounds of a timestamp(tz) range, or to return the number of days between the lower and upper bounds.<br />
<br />
== Functions ==<br />
Two pairs of overloaded functions: the first pair extracts the interval, the second extracts the days.<br />
PostgreSQL docs about Range types: http://www.postgresql.org/docs/current/static/rangetypes.html<br />
<br />
<br />
<source lang="sql"><br />
CREATE OR REPLACE FUNCTION extract_interval(TSTZRANGE) RETURNS interval AS<br />
$func$<br />
select upper($1) - lower($1);<br />
$func$ LANGUAGE sql STABLE;<br />
<br />
CREATE OR REPLACE FUNCTION extract_interval(TSRANGE) RETURNS interval AS<br />
$func$<br />
select upper($1) - lower($1);<br />
$func$ LANGUAGE sql STABLE;<br />
<br />
<br />
CREATE OR REPLACE FUNCTION extract_days(TSTZRANGE) RETURNS integer AS<br />
$func$<br />
select (date_trunc('day',upper($1))::DATE - date_trunc('day',lower($1))::DATE) + 1;<br />
$func$ LANGUAGE sql;<br />
<br />
CREATE OR REPLACE FUNCTION extract_days(TSRANGE) RETURNS integer AS<br />
$func$<br />
select (date_trunc('day',upper($1))::DATE - date_trunc('day',lower($1))::DATE) + 1;<br />
$func$ LANGUAGE sql;<br />
</source><br />
<br />
<br />
== Example ==<br />
<source lang="sql"><br />
WITH tzr AS (SELECT tstzrange('2013-10-01 10:00-07', '2013-10-03 05:15-07') AS dttz, tstzrange('2013-10-01 10:00', '2013-10-03 05:15') AS dt)<br />
SELECT extract_interval(dttz) as interval_with_tz,<br />
extract_interval(dt) as interval_no_tz,<br />
extract_days(dttz) as days_with_tz,<br />
extract_days(dt) as days_no_tz<br />
FROM tzr;<br />
<br />
interval_with_tz | interval_no_tz | days_with_tz | days_no_tz <br />
------------------+----------------+--------------+------------<br />
1 day 19:15:00 | 1 day 19:15:00 | 3 | 3<br />
<br />
</source>
<br />
<hr />
<br />
== Array XOR (Symmetric Difference) ==<br />
<br />
{{SnippetInfo|ArrXor|lang=SQL|version=9.0+|category=Library}}<br />
<br />
Takes two arrays and returns the elements not found in both arrays (akin to using EXCEPT between two arrays, but in both directions).<br />
Original discussion can be found here: http://markmail.org/message/o2f5hvq5vpxqzou7<br />
<br />
Tested in PostgreSQL 9.0+<br />
<br />
"In mathematics, the symmetric difference of two sets is the set of elements which are in either of the sets and not in their intersection. The symmetric difference of the sets A and B is commonly denoted by A Δ B". From http://en.wikipedia.org/wiki/Symmetric_difference<br />
<br />
<source lang="sql"><br />
create or replace function arrxor(anyarray,anyarray) returns anyarray as $$<br />
select ARRAY(<br />
(<br />
select r.elements<br />
from (<br />
(select 1,unnest($1))<br />
union all<br />
(select 2,unnest($2))<br />
) as r (arr, elements)<br />
group by 1<br />
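    -- an element present in both inputs has min(arr)=1 and max(arr)=2 and is filtered out;<br />
    -- only elements appearing in exactly one input survive<br />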
having min(arr) = max(arr)<br />
)<br />
)<br />
$$ language sql strict immutable;<br />
</source>
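<br />
A hypothetical example call (note that the output order of the surviving elements is not guaranteed):<br />
<br />
<source lang="sql"><br />
select arrxor(array[1,2,3], array[2,3,4]);<br />
 arrxor<br />
--------<br />
 {1,4}<br />
</source><br />
<br />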
<hr />
== LOCK DATABASE ==<br />
<br />
LOCK DATABASE is a database-level lock mechanism.<br />
This functionality '''is not implemented in PostgreSQL''', and there are no plans to implement it yet.<br />
<br />
=== Synopsis ===<br />
<br />
LOCK DATABASE db_name [IN dblockmode MODE] [NOWAIT];<br />
UNLOCK DATABASE db_name;<br />
<br />
dblockmode having the following options:<br />
SHARE | EXCLUSIVE<br />
<br />
EXCLUSIVE prevents all SQL queries from being run on the database. No connections are allowed from external applications.<br />
SHARE authorizes SELECT queries and connections to the database. DDL and DML are blocked.<br />
<br />
=== Characteristics ===<br />
<br />
LOCK DATABASE is a database-level lock, waiting for any existing locks to be released before it is taken.<br />
For example, if a lock is held on a table of the database, LOCK DATABASE waits until it is released.<br />
While LOCK DATABASE is waiting, no new table locks can be taken in that database.<br />
<br />
It is also possible to specify NOWAIT, which makes LOCK DATABASE return an error if the lock cannot be taken immediately.<br />
The lock provides a mechanism to ensure that no transaction can be run on the database, and no user can access it, while the lock is held.<br />
<br />
The lock can be released with the command UNLOCK DATABASE, or when the session that locked the database disconnects.<br />
PostgreSQL always takes the lowest level of lock it can when using such mechanisms.<br />
LOCK DATABASE could be used when a database administrator or superuser needs a more restrictive locking mechanism.<br />
<br />
=== Usage ===<br />
<br />
* Server maintenance:<br />
a single database-level lock can be taken instead of multiple locks on the tables of a database.<br />
* Cluster of nodes:<br />
PostgreSQL cannot guarantee a DROP DATABASE query done on multiple nodes at the same time, as the query is autocommit.<br />
DROP DATABASE can easily fail on a node if an application keeps a connection alive on it.<br />
However the DROP succeeds if no sessions are alive on that node of the cluster.<br />
In this case LOCK DATABASE could be used as a 2PC-like mechanism to ensure the database is dropped across a cluster of nodes.<br />
<br />
<br />
[[Category:Feature request]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Unindexed_foreign_keys&diff=20397Unindexed foreign keys2013-07-10T00:28:32Z<p>Brick: Added QUOTE_IDENT() around schemas, tables, and columns (suggestion from Scott Ribe).</p>
<hr />
<div>== Unindexed Foreign Keys ==<br />
<br />
{{SnippetInfo|Unindexed foreign keys|lang=SQL|version=... – 9.2|category=Administrative}}<br />
<br />
I'm not sure of the original source of this query, but it might be from Michael Fuhr in the pgsql-performance mailing lists back in 2007.<br />
<br />
It has been modified in several ways:<br />
* Size of referencing and referenced tables added,<br />
* Suggestion on index to add,<br />
* Searches only for single-column or compound indexes where the leading column is the referencing key.<br />
<br />
Tested in PostgreSQL 9.2 (might work with earlier versions).<br />
<br />
<source lang="sql"><br />
create or replace view admin.unindexed_foreign_keys as<br />
select referencing_tbl,<br />
referencing_column,<br />
existing_fk_on_referencing_tbl,<br />
referenced_tbl,<br />
referenced_column,<br />
pg_size_pretty(referencing_tbl_bytes) as referencing_tbl_size,<br />
pg_size_pretty(referenced_tbl_bytes) as referenced_tbl_size,<br />
suggestion<br />
from (<br />
select (case when n1.nspname is not null then quote_ident(n1.nspname) else 'public' end) || '.' || quote_ident(c1.relname) as referencing_tbl,<br />
quote_ident(a1.attname) as referencing_column,<br />
t.conname as existing_fk_on_referencing_tbl,<br />
(case when n2.nspname is not null then quote_ident(n2.nspname) else 'public' end) || '.' || quote_ident(c2.relname) || '.' || quote_ident(a2.attname) as referenced_tbl,<br />
quote_ident(a2.attname) as referenced_column,<br />
pg_relation_size( ((case when n1.nspname is not null then quote_ident(n1.nspname) else 'public' end) || '.' || quote_ident(c1.relname)) ) as referencing_tbl_bytes,<br />
pg_relation_size( ((case when n2.nspname is not null then quote_ident(n2.nspname) else 'public' end) || '.' || quote_ident(c2.relname)) ) as referenced_tbl_bytes,<br />
'Create an index on column ' || quote_ident(a1.attname) || ' in table ' ||<br />
(case when n1.nspname is not null then quote_ident(n1.nspname) else 'public' end) || '.' || quote_ident(c1.relname) as suggestion<br />
from pg_constraint t<br />
join pg_attribute a1 on a1.attrelid = t.conrelid and a1.attnum = t.conkey[1]<br />
join pg_class c1 on c1.oid = t.conrelid<br />
join pg_namespace n1 on n1.oid = c1.relnamespace<br />
join pg_class c2 on c2.oid = t.confrelid<br />
join pg_namespace n2 on n2.oid = c2.relnamespace<br />
join pg_attribute a2 on a2.attrelid = t.confrelid and a2.attnum = t.confkey[1]<br />
where t.contype = 'f'<br />
and not exists<br />
(select 1<br />
from pg_index i<br />
where i.indrelid = t.conrelid<br />
and i.indkey[0] = t.conkey[1])<br />
) as y<br />
order by referencing_tbl_bytes desc, referenced_tbl_bytes desc, referencing_tbl,<br />
referenced_tbl, referencing_column, referenced_column;<br />
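<br />
-- A minimal usage sketch; this assumes the "admin" schema used above<br />
-- already exists:<br />
--   SELECT referencing_tbl, suggestion FROM admin.unindexed_foreign_keys;<br />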
</source></div>Brickhttps://wiki.postgresql.org/index.php?title=ArrXor&diff=20365ArrXor2013-07-08T15:32:57Z<p>Brick: </p>
<hr />
<div>== Array XOR (Symmetric Difference) ==<br />
<br />
{{SnippetInfo|ArrXor|lang=SQL|version= 9.0|category=Library}}<br />
<br />
Takes two arrays and returns the elements not found in both arrays (akin to applying EXCEPT in both directions between two arrays).<br />
Original discussion can be found here: http://markmail.org/message/o2f5hvq5vpxqzou7<br />
<br />
Tested in PostgreSQL 9.0+<br />
<br />
"In mathematics, the symmetric difference of two sets is the set of elements which are in either of the sets and not in their intersection. The symmetric difference of the sets A and B is commonly denoted by A Δ B". From http://en.wikipedia.org/wiki/Symmetric_difference<br />
<br />
<source lang="sql"><br />
create or replace function arrxor(anyarray,anyarray) returns anyarray as $$<br />
select ARRAY(<br />
(<br />
select r.elements<br />
from (<br />
(select 1,unnest($1))<br />
union all<br />
(select 2,unnest($2))<br />
) as r (arr, elements)<br />
group by 1<br />
having min(arr) = max(arr)<br />
)<br />
)<br />
$$ language sql strict immutable;<br />
</source></div>Brickhttps://wiki.postgresql.org/index.php?title=ArrXor&diff=20364ArrXor2013-07-08T15:32:37Z<p>Brick: Created page with "== Array XOR (Symmetric Difference) == {{SnippetInfo|ArrXor|lang=SQL|version=... - 9.0|category=Library}} Takes two arrays and returns the elements not found in both array (..."</p>
<hr />
<div>== Array XOR (Symmetric Difference) ==<br />
<br />
{{SnippetInfo|ArrXor|lang=SQL|version=... - 9.0|category=Library}}<br />
<br />
Takes two arrays and returns the elements not found in both arrays (akin to applying EXCEPT in both directions between two arrays).<br />
Original discussion can be found here: http://markmail.org/message/o2f5hvq5vpxqzou7<br />
<br />
Tested in PostgreSQL 9.0+<br />
<br />
"In mathematics, the symmetric difference of two sets is the set of elements which are in either of the sets and not in their intersection. The symmetric difference of the sets A and B is commonly denoted by A Δ B". From http://en.wikipedia.org/wiki/Symmetric_difference<br />
<br />
<source lang="sql"><br />
create or replace function arrxor(anyarray,anyarray) returns anyarray as $$<br />
select ARRAY(<br />
(<br />
select r.elements<br />
from (<br />
(select 1,unnest($1))<br />
union all<br />
(select 2,unnest($2))<br />
) as r (arr, elements)<br />
group by 1<br />
having min(arr) = max(arr)<br />
)<br />
)<br />
$$ language sql strict immutable;<br />
</source></div>Brickhttps://wiki.postgresql.org/index.php?title=User:Brick&diff=20363User:Brick2013-07-08T15:04:43Z<p>Brick: Created page with "bricklen @ gmail.com"</p>
<hr />
<div>bricklen @ gmail.com</div>Brickhttps://wiki.postgresql.org/index.php?title=Streaming_Replication&diff=20359Streaming Replication2013-07-08T14:35:00Z<p>Brick: </p>
<hr />
<div>'''Streaming Replication''' (SR) provides the capability to continuously ship and<br />
apply the [http://www.postgresql.org/docs/current/static/wal.html WAL XLOG] records to some number of standby servers in order to keep them current.<br />
<br />
This feature was added to PostgreSQL 9.0. The discussion below is a developer-oriented one that contains some out-of-date information. Users of this feature should use the documentation for the feature or a setup tutorial instead:<br />
<br />
* [http://www.postgresql.org/docs/current/static/warm-standby.html PostgreSQL Streaming Replication Documentation] - this documentation is for the latest version, but provides links you can use to look up the docs for your version.<br />
<br />
* [[Binary Replication Tutorial]] provides an introduction to using this replication feature.<br />
<br />
* The related but independent [[Hot Standby]] feature.<br />
<br />
<br />
= Developer and historical details on the project =<br />
SR was developed for inclusion in PostgreSQL 9.0 by NTT OSS Center. The lead developer is [mailto:masao.fujii@gmail.com Masao Fujii]. [http://www.pgcon.org/2008/schedule/events/76.en.html Synchronous Log Shipping Replication Presentation] introduces the early design of the feature.<br />
<br />
= Usage =<br />
== Users Overview ==<br />
* '''Log-shipping'''<br />
** XLOG records generated in the primary are periodically shipped to the standby via the network.<br />
** In the existing warm standby, only records in a filled file are shipped, which is referred to as file-based log-shipping. In SR, XLOG records in a partially-filled XLOG file are shipped too, implementing record-based log-shipping. This means the window for data loss in SR is usually smaller than in warm standby, unless the warm standby was also configured for record-based shipping (which is complicated to set up).<br />
** The contents of XLOG files written to the standby are exactly the same as those on the primary. XLOG files shipped can be used for a normal recovery and PITR.<br />
* '''Multiple standbys'''<br />
** More than one standby can establish a connection to the primary for SR. XLOG records are concurrently shipped to all these standbys. The delay/death of a standby does not harm log-shipping to other standbys.<br />
** The maximum number of standbys can be specified as a GUC variable.<br />
* '''Continuous recovery'''<br />
** The standby continuously replays XLOG records shipped without using pg_standby.<br />
** XLOG records shipped are replayed as soon as possible without waiting until XLOG file has been filled. The combination of [[Hot Standby]] and SR would make the latest data inserted into the primary visible in the standby almost immediately.<br />
** The standby periodically removes old XLOG files which are no longer needed for recovery, to prevent excessive disk usage.<br />
* '''Setup'''<br />
** The start of log-shipping does not interfere with any query processing on the primary.<br />
** The standby can be started in various conditions.<br />
*** If there are XLOG files in the archive directory and restore_command is supplied, those files are replayed first. Then the standby requests from the primary the XLOG records following the last applied one. This prevents XLOG files already present in the standby from being shipped again. Similarly, XLOG files in pg_xlog are also replayed before starting log-shipping.<br />
*** If there are no XLOG files on the standby, the standby requests XLOG records following the starting XLOG location of recovery (the redo starting location).<br />
* '''Connection settings and authentication'''<br />
** A user can configure a connection for SR with the same settings as a normal connection (e.g., keepalive, pg_hba.conf).<br />
* '''Activation'''<br />
** The standby can keep waiting for activation as long as a user likes. This prevents the standby from being automatically brought up by failure of recovery or network outage.<br />
* '''Progress report'''<br />
** The primary and standby report the progress of log-shipping in PS display.<br />
* '''Graceful shutdown'''<br />
** When smart/fast shutdown is requested, the primary waits to exit until XLOG records have been sent to the standby, up to the shutdown checkpoint record.<br />
<br />
== Restrictions ==<br />
* '''Synchronous log-shipping'''<br />
** By default, SR operates in an asynchronous manner, so the commit command might return a "success" to a client before the corresponding XLOG records are shipped to the standby. To enable synchronous replication, see [http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION Synchronous Replication]<br />
* '''Replication beyond timeline'''<br />
** A user has to get a fresh backup whenever making the old standby catch up.<br />
* '''Clustering'''<br />
** Postgres doesn't provide any clustering feature.<br />
<br />
== How to Use ==<br />
* '''1.''' Install postgres in the primary and standby server as usual. This requires only ''configure'', ''make'' and ''make install''.<br />
* '''2.''' Create the initial database cluster in the primary server as usual, using ''initdb''.<br />
* '''3.''' Set up connections and authentication so that the standby server can successfully connect to the ''replication'' pseudo-database on the primary.<br />
$ $EDITOR postgresql.conf<br />
<br />
listen_addresses = '192.168.0.10'<br />
<br />
$ $EDITOR pg_hba.conf<br />
<br />
# The standby server must have superuser access privileges.<br />
host replication postgres 192.168.0.20/22 trust<br />
* '''4.''' Set up the streaming replication related parameters on the primary server.<br />
$ $EDITOR postgresql.conf<br />
<br />
# To enable read-only queries on a standby server, wal_level must be set to<br />
# "hot_standby". But you can choose "archive" if you never connect to the<br />
# server in standby mode.<br />
wal_level = hot_standby<br />
<br />
# Set the maximum number of concurrent connections from the standby servers.<br />
max_wal_senders = 5<br />
<br />
# To prevent the primary server from removing the WAL segments required for<br />
# the standby server before shipping them, set the minimum number of segments<br />
# retained in the pg_xlog directory. At least wal_keep_segments should be<br />
# larger than the number of segments generated between the beginning of<br />
# online-backup and the startup of streaming replication. If you enable WAL<br />
# archiving to an archive directory accessible from the standby, this may<br />
# not be necessary.<br />
wal_keep_segments = 32<br />
<br />
# Enable WAL archiving on the primary to an archive directory accessible from<br />
# the standby. If wal_keep_segments is a high enough number to retain the WAL<br />
# segments required for the standby server, this is not necessary.<br />
archive_mode = on<br />
archive_command = 'cp %p /path_to/archive/%f'<br />
* '''5.''' Start postgres on the primary server.<br />
* '''6.''' Make a base backup by copying the primary server's data directory to the standby server.<br />
$ psql -c "SELECT pg_start_backup('label', true)"<br />
$ rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid<br />
$ psql -c "SELECT pg_stop_backup()"<br />
* '''7.''' Set up replication-related parameters, connections and authentication in the standby server like the primary, so that the standby might work as a primary after failover.<br />
* '''8.''' Enable read-only queries on the standby server. But if wal_level is ''archive'' on the primary, leave hot_standby unchanged (i.e., off).<br />
$ $EDITOR postgresql.conf<br />
<br />
hot_standby = on<br />
* '''9.''' Create a recovery command file in the standby server; the following parameters are required for streaming replication.<br />
$ $EDITOR recovery.conf<br />
# Note that recovery.conf must be in $PGDATA directory.<br />
<br />
# Specifies whether to start the server as a standby. In streaming replication,<br />
# this parameter must be set to on.<br />
standby_mode = 'on'<br />
<br />
# Specifies a connection string which is used for the standby server to connect<br />
# with the primary.<br />
primary_conninfo = 'host=192.168.0.10 port=5432 user=postgres'<br />
<br />
# Specifies a trigger file whose presence should cause streaming replication to<br />
# end (i.e., failover).<br />
trigger_file = '/path_to/trigger'<br />
<br />
# Specifies a command to load archive segments from the WAL archive. If<br />
# wal_keep_segments is a high enough number to retain the WAL segments<br />
# required for the standby server, this may not be necessary. But<br />
# a large workload can cause segments to be recycled before the standby<br />
# is fully synchronized, requiring you to start again from a new base backup.<br />
restore_command = 'cp /path_to/archive/%f "%p"'<br />
* '''10.''' Start postgres in the standby server. It will start streaming replication.<br />
* '''11.''' You can calculate the replication lag by comparing the current WAL write location on the primary with the last WAL location received/replayed by the standby. They can be retrieved using ''pg_current_xlog_location'' on the primary and ''pg_last_xlog_receive_location''/''pg_last_xlog_replay_location'' on the standby, respectively.<br />
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)<br />
pg_current_xlog_location <br />
--------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_receive_location <br />
-------------------------------<br />
0/2000000<br />
(1 row)<br />
<br />
$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)<br />
pg_last_xlog_replay_location <br />
------------------------------<br />
0/2000000<br />
(1 row)<br />
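The byte gap between two such locations can also be computed in a single query. A minimal sketch, assuming PostgreSQL 9.2 or later (''pg_xlog_location_diff'' does not exist in earlier versions) and using the standby location from the output above:<br />
 $ psql -c "SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/2000000')" -h192.168.0.10 (primary host)<br />
  pg_xlog_location_diff <br />
 -----------------------<br />
                      0<br />
 (1 row)<br />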
* '''12.''' You can also check the progress of streaming replication by using the ''ps'' command.<br />
# The displayed LSNs indicate the byte position that the standby server has<br />
# written up to in the xlogs.<br />
[primary] $ ps -ef | grep sender<br />
postgres 6879 6831 0 10:31 ? 00:00:00 postgres: wal sender process postgres 127.0.0.1(44663) streaming 0/2000000<br />
<br />
[standby] $ ps -ef | grep receiver<br />
postgres 6878 6872 1 10:31 ? 00:00:01 postgres: wal receiver process streaming 0/2000000<br />
* How to do failover<br />
** Create the trigger file in the standby after the primary fails.<br />
* How to stop the primary or the standby server<br />
** Shut down it as usual (''pg_ctl stop'').<br />
* How to restart streaming replication after failover<br />
** Repeat the operations from step '''6''': make a fresh backup, do some configuration, and start the original primary as the standby. The primary server doesn't need to be stopped during these operations.<br />
* How to restart streaming replication after the standby fails<br />
** Restart postgres in the standby server after eliminating the cause of failure.<br />
* How to disconnect the standby from the primary<br />
** Create the trigger file in the standby while the primary is running. Then the standby would be brought up.<br />
* How to re-synchronize the stand-alone standby after isolation<br />
** Shut down the standby as usual, then repeat the operations from step '''6'''.<br />
* If you have more than one slave, promoting one will break the other(s). Update their recovery.conf settings to point to the new master, set recovery_target_timeline to 'latest', scp/rsync the pg_xlog directory, and restart the slave.<br />
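For example, with the recovery.conf shown above, failover can be triggered on the standby simply by creating the trigger file:<br />
 [standby] $ touch /path_to/trigger<br />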
<br />
= Todo =<br />
== v9.0 ==<br />
<br />
Moved to [[PostgreSQL_9.0_Open_Items]]<br />
<br />
=== Committed ===<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01455.php Retrying from archive and some refactoring around Read/FetchRecord().] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00395.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg02601.php SR wrongly treats the WAL-boundary.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00396.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01715.php Adjust SR for some later changes about wal-skipping.] - [http://archives.postgresql.org/pgsql-committers/2010-01/msg00399.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00024.php VACUUM FULL unexpectedly writes an XLOG UNLOGGED record.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00038.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01754.php Add a message type header.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00037.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php Documentation: Add a new "Replication" chapter.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00115.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00350.php Failed assertion during recovery of partial WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00124.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00712.php A PANIC error might occur in the standby because of a partially-filled archived WAL file.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00137.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00330.php Improve the standby messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00140.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php pq_getbyte_if_available() is not working because the win32 socket emulation layer simply wasn't designed to deal with non-blocking sockets.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00198.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01488.php Walsender might emit unfit messages.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00239.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01236.php Streaming replication on win32, still broken.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00270.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg00992.php Create new section for recovery.conf.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00295.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01824.php Assertion failure in walreceiver.] - [http://archives.postgresql.org/pgsql-committers/2010-02/msg00356.php commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01717.php Forbid a startup of walsender during recovery, and emit a suitable message? Or allow walsender to be started also during recovery?] - [http://archives.postgresql.org/message-id/20100316090955.9A5107541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php How do we clean down the archive without using pg_standby?] - [http://archives.postgresql.org/message-id/20100318091718.BC14D7541D0@cvs.postgresql.org commit]<br />
* [http://archives.postgresql.org/pgsql-hackers/2010-02/msg01510.php File-based log shipping without pg_standby doesn't replay the WAL files in pg_xlog.] - [http://archives.postgresql.org/pgsql-committers/2010-03/msg00356.php commit]<br />
<br />
== v9.1 ==<br />
=== Synchronization capability ===<br />
* Introduce the replication mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'' and ''fsync''.<br />
** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
** ''recv'' or ''fsync'' makes transaction commit wait for XLOG to be received or fsynced by the standby, respectively.<br />
** (''apply'' makes transaction commit wait for XLOG to be replayed by the standby. This mode will be supported in v9.2 or later)<br />
** The replication mode is specified in recovery.conf of the standby as well as other parameters for replication.<br />
*** The startup process reads the replication mode from recovery.conf and shares it to walreceiver via new shared-memory variable.<br />
*** Walreceiver also shares it to walsender by using the replication handshake message (existing protocol needs to be extended).<br />
** Based on the replication mode, walreceiver sends the reply meaning that replication is done up to the specified location to the primary.<br />
*** In async, walreceiver doesn't need to send any reply other than end-of-replication message.<br />
*** In recv or fsync, walreceiver sends the reply just after receiving or flushing XLOG, respectively.<br />
*** New message type for the reply needs to be defined. The reply is sent as CopyData message.<br />
** Walreceiver writes all the outstanding XLOG to disk before shutting down.<br />
** Walsender receives the reply from the standby, updates the location of the last record replicated, and announces completion of replication.<br />
*** New shared-memory variable to keep that location is required.<br />
** When processing the commit command, backend waits for XLOG to be replicated to only the standbys which are in the recv or fsync replication mode.<br />
*** Also smart shutdown waits for XLOG of shutdown checkpoint to be replicated.<br />
* Required optimization<br />
** Walsender should send outstanding XLOG without waiting wal_sender_delay.<br />
*** When processing the commit command, backend signals walsender to send outstanding XLOG immediately.<br />
** Backend should exit the wait loop as soon as the reply arrives at the primary.<br />
*** When receiving the reply, walsender signals backends to get up from the sleep and determine whether to exit the wait loop by checking the location of the last XLOG replicated.<br />
*** Only backends waiting for XLOG to be replicated up to the location contained in the reply are sent the signal.<br />
** Walsender waits for the signal from backends and the reply from the standby at the same time, by using select/poll.<br />
** Walsender reads XLOG from not only disk but also shared memory (wal buffers).<br />
** Walreceiver should flush XLOG file only when XLOG file is switched or the related page is flushed.<br />
*** When startup process or bgwriter flushes the buffer page, it checks whether the related XLOG has already been flushed via shared memory (location of the last XLOG flushed).<br />
*** It flushes the buffer page, if XLOG file has already been flushed.<br />
*** It signals walreceiver to flush XLOG file immediately and waits for the flush to complete, if XLOG file has not been flushed yet.<br />
** While the standby is catching up with the primary, those servers should ignore the replication mode and perform asynchronous replication.<br />
*** After those servers have almost gotten into synchronization, they perform replication based on the specified replication mode.<br />
*** New replication states like 'catching-up', 'sync', etc need to be defined, and the state machine for them is required on both servers.<br />
*** Current replication state can be monitored on both servers via SQL.<br />
* Required timeout<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG is replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
<br />
== Future release ==<br />
* '''Synchronization capability'''<br />
** Introduce the synchronization mode which can control how long transaction commit waits for replication before the commit command returns a "success" to a client. The valid modes are ''async'', ''recv'', ''fsync'' and ''apply''.<br />
*** ''async'' doesn't make transaction commit wait for replication, i.e., asynchronous replication.<br />
*** ''recv'', ''fsync'' and ''apply'' make transaction commit wait for XLOG records to be received, fsynced and applied on the standby, respectively.<br />
** Change walsender to be able to read XLOG from not only the disk but also shared memory.<br />
** Add new parameter replication_timeout which is the maximum time to wait until XLOG records are replicated to the standby. (does this match http://www.postgresql.org/docs/current/interactive/runtime-config-replication.html ?)<br />
** Add new parameter (replication_timeout_action) to specify the reaction to replication_timeout.<br />
* '''Monitoring'''<br />
** Provide the capability to check the progress and gap of streaming replication via one query. A collaboration of HS and SR is necessary to provide that capability on the standby side.<br />
** Provide the capability to check if the specified replication is in progress via a query. Also more detailed status information might be necessary, e.g., the standby is catching up now, has already gotten into sync, and so on.<br />
** Change the stats collector to collect the statistics information about replication, e.g., average delay of replication time.<br />
** Develop the tool to calculate the latest XLOG position from XLOG files. This is necessary to check the gap of replication after the server fails.<br />
** Also develop the tool to extract the user-readable contents from XLOG files. This is necessary to see the contents of the gap, and manually restore them.<br />
* '''Easy to Use'''<br />
** Introduce the parameters like:<br />
*** replication_halt_timeout - replication will halt if no data has been sent for this much time.<br />
*** replication_halt_segments - replication will halt if number of WAL files in pg_xlog exceeds this threshold.<br />
*** These parameters allow us to avoid disk overflow.<br />
** Add new feature which transfers also base backup via the direct connection between the primary and the standby.<br />
** Add new hooks like walsender_hook and walreceiver_hook to cooperate with the add-on program for compression like pglesslog.<br />
** Provide a graceful termination of replication via a query on the primary. On the standby, a trigger file mechanism already provides that capability.<br />
** Support replication beyond timeline. The timeline history files need to be shipped from the primary to the standby.<br />
* '''Robustness'''<br />
** Support keepalive in libpq. This is useful for a client and the standby to detect a failure of the primary immediately.<br />
** [http://archives.postgresql.org/pgsql-hackers/2010-01/msg01536.php New privilege for replication.]<br />
*** Currently superuser privilege is required when the standby connects to the primary. But there is the complaint that we should add new privilege for replication and use it instead of superuser because current approach is not good for security.<br />
* '''Miscellaneous'''<br />
** Standalone walreceiver tool, which connects to the primary, continuously receives and writes XLOG records, independently from postgres server.<br />
** Cascade streaming replication. Allow walsender to send XLOG to another standby during recovery.<br />
** WAL archiving during recovery.<br />
<br />
[[Category:Replication]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Unindexed_foreign_keys&diff=20358Unindexed foreign keys2013-07-07T23:23:58Z<p>Brick: Created page with " == Unindexed Foreign Keys == I'm not sure of the original source of this query, but it might be from Michael Fuhr in the pgsql-performance mailing lists back in 2007. It ha..."</p>
<hr />
<div><br />
== Unindexed Foreign Keys ==<br />
<br />
I'm not sure of the original source of this query, but it might be from Michael Fuhr in the pgsql-performance mailing lists back in 2007.<br />
<br />
It has been modified in several ways:<br />
* Size of referencing and referenced tables added,<br />
* Suggestion on index to add,<br />
* Searches only for single-column or compound indexes where the leading column is the referencing key.<br />
<br />
Tested in PostgreSQL 9.2 (might work with earlier versions).<br />
<br />
<br />
create or replace view unindexed_foreign_keys as<br />
select referencing_tbl,<br />
referencing_column,<br />
existing_fk_on_referencing_tbl,<br />
referenced_tbl,<br />
referenced_column,<br />
pg_size_pretty(referencing_tbl_bytes) as referencing_tbl_size,<br />
pg_size_pretty(referenced_tbl_bytes) as referenced_tbl_size,<br />
suggestion<br />
from (<br />
select (case when n1.nspname is not null then n1.nspname<br />
else 'public' end) || '.' || c1.relname as referencing_tbl,<br />
a1.attname as referencing_column,<br />
t.conname as existing_fk_on_referencing_tbl,<br />
(case when n2.nspname is not null then n2.nspname<br />
else 'public' end) || '.' || c2.relname as referenced_tbl,<br />
a2.attname as referenced_column,<br />
pg_relation_size( ((case when n1.nspname is not null then n1.nspname<br />
else 'public' end) || '.' || c1.relname) ) as referencing_tbl_bytes,<br />
pg_relation_size( ((case when n2.nspname is not null then n2.nspname<br />
else 'public' end) || '.' || c2.relname) ) as referenced_tbl_bytes,<br />
'Create an index on column "' || quote_ident(a1.attname) || '" in table "' ||<br />
(case when n1.nspname is not null then n1.nspname<br />
else 'public' end) || '.' || c1.relname || '"' as suggestion<br />
from pg_constraint t<br />
join pg_attribute a1 on a1.attrelid = t.conrelid and a1.attnum = t.conkey[1]<br />
join pg_class c1 on c1.oid = t.conrelid<br />
join pg_namespace n1 on n1.oid = c1.relnamespace<br />
join pg_class c2 on c2.oid = t.confrelid<br />
join pg_namespace n2 on n2.oid = c2.relnamespace<br />
join pg_attribute a2 on a2.attrelid = t.confrelid and a2.attnum = t.confkey[1]<br />
where t.contype = 'f'<br />
and not exists<br />
(select 1<br />
from pg_index i<br />
where i.indrelid = t.conrelid<br />
and i.indkey[0] = t.conkey[1])<br />
) as y<br />
order by referencing_tbl_bytes desc, referenced_tbl_bytes desc,<br />
referencing_tbl, referenced_tbl, referencing_column, referenced_column;</div>Brickhttps://wiki.postgresql.org/index.php?title=Binary_Replication_Tools&diff=20327Binary Replication Tools2013-07-02T21:40:04Z<p>Brick: Updated the pg_basebackup link to point to the current version.</p>
<hr />
<div>= Disclaimer =<br />
<br />
This is a Work-In-Progress, started Dec 28, 2012.<br />
<br />
Additions welcome, and [[User:Selena|Selena]] reserves the right to edit. :)<br />
<br />
= Purpose = <br />
<br />
Compare binary replication tools for PostgreSQL for features and ease of use. This document should classify and differentiate binary replication tools for easier selection and fit to purpose.<br />
<br />
== Comparison Matrix ==<br />
<br />
{|border=1<br />
!Tool!!Documentation!!License!!Makes base backups!!Makes base backups from replicas!!Manages backups!!Creates replicas!!Monitors replication delay!!Supports automated failover!!Transport used!!Source includes replication tests<br />
|- style="background-color:#ffffcc;"<br />
| [http://www.postgresql.org/docs/current/static/app-pgbasebackup.html pg_basebackup]*<br />
| [http://www.postgresql.org/docs/current/static/app-pgbasebackup.html Postgres docs]<br />
| PostgreSQL<br />
| Yes<br />
| Yes<br />
| No<br />
| Manual<br />
| No<br />
| No<br />
| PostgreSQL connection<br />
|<br />
|-<br />
| [http://www.pgbarman.org/ pgbarman] <br />
| [http://docs.pgbarman.org/ Documentation]<br />
| GPLv3<br />
| Yes<br />
| Yes<br />
| Yes<br />
| Manual<br />
| No<br />
| No<br />
| ssh<br />
|<br />
|-<br />
| [https://github.com/omniti-labs/omnipitr OmniPITR]<br />
| [https://github.com/omniti-labs/omnipitr/blob/master/doc/intro.pod Intro]<br />
| PostgreSQL<br />
| Yes<br />
| Yes<br />
| No<br />
| Manual<br />
| WAL archive delay<br />
| No<br />
| rsync / ssh<br />
|<br />
|-<br />
| [http://code.google.com/p/pg-rman/ pg-rman] <br />
| [http://code.google.com/p/pg-rman/wiki/readme Readme]<br />
| BSD<br />
| Yes<br />
| Yes<br />
| Yes<br />
| Manual<br />
| No<br />
| No<br />
| local / NFS mount<br />
|<br />
|-<br />
| [http://www.repmgr.org/ repmgr]<br />
| [https://github.com/2ndQuadrant/repmgr#readme Readme]<br />
| GPLv3<br />
| No<br />
| No<br />
| No<br />
| Yes<br />
| Yes<br />
| Yes<br />
| rsync / ssh<br />
|<br />
|-<br />
| [https://github.com/markokr/skytools Skytools]<br />
| [https://github.com/markokr/skytools/blob/master/doc/walmgr3.txt walmgr3]<br />
| BSD-ish<br />
|<br />
|<br />
|<br />
|<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
| [https://github.com/wal-e/WAL-E WAL-E]<br />
| [https://github.com/wal-e/WAL-E#readme Readme]<br />
| BSD<br />
| Yes<br />
| No<br />
| Yes<br />
| Manual<br />
| No<br />
| No<br />
| HTTPS/SSL<br />
| <br />
|-<br />
|}<br />
<br />
* pg_basebackup is included with a standard PostgreSQL installation.<br />
<br />
<br />
'''Tool''': name of the project that manages binary replication or WAL archiving<br />
<br />
'''Documentation''': Link to canonical documentation for the project. Several projects have broken links that show up as top results in Google.<br />
<br />
'''License''': License the software is released under. So far, we only have open/free software projects listed. We could add commercial projects.<br />
<br />
'''Makes base backups''': Yes if the project supports creating binary archives, including the necessary WAL to restore a backed-up instance.<br />
<br />
'''Makes base backups from replicas''': Yes if the project supports creating binary archives and WAL using the PGDATA directory from a replica rather than the master database.<br />
<br />
'''Manages backups''': Yes if the project adds, removes and lists binary archives.<br />
<br />
'''Creates replicas''': Yes if the project automatically adds a recovery.conf (sets up replication) as part of restoring a base backup.<br />
<br />
'''Monitors replication delay''': Yes if the project supports monitoring of replication delay (WAL shipping or streaming replication).<br />
<br />
'''Supports automated failover''': Yes if the project has an option for detecting master failure and promoting a replica to master.<br />
<br />
'''Transport used''': Supported methods for file transfer for making backups or replicas<br />
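<br />
As a point of reference for the matrix above, a base backup with the bundled pg_basebackup tool is a single command. A minimal sketch (the host name and target directory are placeholders; -P simply reports progress):<br />
 $ pg_basebackup -h primary.example.com -U postgres -D /srv/pgsql/standby -P<br />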
<br />
= Barman = <br />
<br />
[http://www.pgbarman.org/ pgbarman] [https://github.com/selenamarie/pg_replication_demo/tree/master/barman Demo setup for pgbarman]<br />
<br />
Summary of features: <br />
* Creates base backups<br />
* Uses SSH as transport for backup<br />
* Configuration stored in file or command-line<br />
* Restore is automatable for creating replicas, although not explicitly supported<br />
* GPLv3<br />
<br />
Install notes: <br />
* Had a dependency problem with 9.1 on Ubuntu Precise, so installed from source<br />
* Missing dep for argcomplete, noted in README in demo repo<br />
* written in Python<br />
<br />
= OmniPITR = <br />
<br />
[https://github.com/omniti-labs/omnipitr OmniPITR]<br />
[https://github.com/selenamarie/pg_replication_demo/tree/master/omnipitr Demo of a simple replication setup with OmniPITR]<br />
<br />
Summary of Features:<br />
* Creating PITR backups from Master or Slave<br />
* Restoring a PITR backup for DR<br />
* Creating replicas (by untarring backups)<br />
* Monitoring of replicas<br />
* Supports 'pause removal' of WAL during a backup (nice!)<br />
* PostgreSQL license<br />
<br />
Install notes: <br />
* No packaging, perl <br />
* No documented support for streaming replication<br />
* Uses '^' instead of '%' in custom logfile naming<br />
* No configuration file option (instead of using long command-line options)<br />
* Supported on all Linux, Solaris variants<br />
<br />
= pg-rman =<br />
<br />
[http://code.google.com/p/pg-rman/ pg-rman] [https://github.com/selenamarie/pg_replication_demo/tree/master/pgrman Simple demo for pg_rman setup]<br />
<br />
Summary of features: <br />
* Online backup and recovery, including backup from a replica<br />
* Archive management and restore<br />
* .ini configuration file<br />
* Simple command-line options<br />
<br />
Install notes: <br />
* Written in C, install with 'make USE_PGXS=1'<br />
* On Ubuntu/Debian, installs in non-default bin directory<br />
* Commands are simple, manages WAL archive as well as backups<br />
<br />
<br />
= repmgr =<br />
<br />
[http://www.repmgr.org/ repmgr]<br />
[https://github.com/selenamarie/pg_replication_demo/tree/master/repmgr Demo of a simple setup with repmgr]<br />
<br />
Supported features:<br />
* Setting up new replicas/hot_standby with streaming replication (makes recovery.conf itself)<br />
* Making base backups<br />
* Failover (automated, or not, including redirecting replicas to connect to a new master after failover)<br />
* Lag monitoring (repmgrd)<br />
* A "witness" DB server for monitoring (typically on a replica)<br />
* License: GPLv3<br />
<br />
Install notes: <br />
* Written in C<br />
* Installs like a typical UNIX utility out of postgresql/contrib source tree (make USE_PGXS=1; make USE_PGXS=1 install)<br />
* Developed on Debian systems, so package support is present. Did not find an Ubuntu package for repmgr, however.<br />
* Detailed docs are in the README for installing on many Linux platforms<br />
* Doesn't appear to be supported on Mac OS X<br />
<br />
= Skytools / walmgr =<br />
<br />
[https://github.com/markokr/skytools Skytools]<br />
<br />
= WAL-E =<br />
<br />
[https://github.com/heroku/WAL-E WAL-E]</div>Brickhttps://wiki.postgresql.org/index.php?title=Using_EXPLAIN&diff=18642Using EXPLAIN2012-11-22T17:51:55Z<p>Brick: Web site no longer exists.</p>
<hr />
<div>{{Languages}}<br />
Figuring out why a statement is taking so long to execute is done with the [http://www.postgresql.org/docs/current/static/sql-explain.html EXPLAIN] command. You can run this in two ways; if you use EXPLAIN ANALYZE, it will actually run the statement and let you compare what the planner thought was going to happen with what actually did. Note that if the statement changes data, that will also happen when you run it with EXPLAIN ANALYZE; if you just use EXPLAIN, the statement doesn't do anything to the database.<br />
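<br />
For example (the table name here is hypothetical), the first statement below only shows the plan, while the second actually executes the DELETE; wrapping it in a transaction that is rolled back leaves the data intact:<br />
<source lang="sql"><br />
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;<br />
<br />
BEGIN;<br />
EXPLAIN ANALYZE DELETE FROM orders WHERE customer_id = 42;<br />
ROLLBACK;<br />
</source><br />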
<br />
There are some tools available to help interpret EXPLAIN output:<br />
* pgadmin includes a visual EXPLAIN tool that helps map out what's actually happening. See [http://www.postgresonline.com/journal/index.php?/archives/27-Reading-PgAdmin-Graphical-Explain-Plans.html Reading PgAdmin Graphical Explain Plans].<br />
* Visual Explain, originally a [http://sources.redhat.com/rhdb/visualexplain.html RedHat component] that has been kept current and improved by Enterprise DB, comes bundled with the EnterpriseDB Advanced Server package. It can be built to run against other PostgreSQL installations using the source code to their [http://www.enterprisedb.com/products/download.do Developer Studio] package.<br />
* [http://explain.depesz.com/ explain.depesz.com] shows explain plan with extracted summarized times, and highlights based on chosen criteria.<br />
<br />
And there are a few tutorials on this subject, many of which are titled "Explaining Explain" (sigh):<br />
* [http://www.postgresql.org/docs/current/interactive/performance-tips.html Performance Tips]<br />
* [http://www.gtsm.com/oscon2003/toc.html Efficient SQL] by Greg Sabino Mullane (2003)<br />
* Explaining Explain: [http://redivi.com/~bob/oscon2005_pgsql_pdf/OSCON_Explaining_Explain_Public.pdf pdf] [http://www.postgresql.org/communityfiles/13.sxi OpenOffice Presentation] by Robert Treat (2005)<br />
* [http://pooteeweet.org/files/phpworkso6/exlaining_explain.pdf Explaining Explain] by Lukas Smith (2006)<br />
* [http://wiki.postgresql.org/images/4/45/Explaining_EXPLAIN.pdf Explaining Explain] by Greg Stark (2008)<br />
* [http://neilconway.org/talks/executor.pdf Query Execution Techniques in PostgreSQL] by Neil Conway<br />
* [[Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT]] by Jim Nasby<br />
* [https://sites.google.com/site/robertmhaas/presentations The PostgreSQL Query Planner] by Robert Haas (2010)<br />
* [http://www.2ndquadrant.com/books/postgresql-9-0-high-performance/ PostgreSQL 9.0 High Performance] (2010) is a book with a long discussion of how to use EXPLAIN, read the resulting query plans, and make changes to get different plans.<br />
<br />
A common problem that causes the planner to make bad decisions is not keeping [[Planner Statistics]] updated. Another is leaving the tuning parameters that let the server know how much memory is available at the very small defaults. For example, in the stock configuration, sorts that take more than 1MB are swapped to disk as being too big to process in memory. [[Tuning Your PostgreSQL Server]] covers good practices for sizing the memory and other tuning parameters that most impact query planning.<br />
<br />
One thing you can do when stumped by a plan is to submit the full EXPLAIN ANALYZE output, along with the schema of the involved queries, to the [http://archives.postgresql.org/pgsql-performance/ pgsql-performance] mailing list. To get a useful reply more quickly, please read [[SlowQueryQuestions]] before posting. Note that if you're not running a relatively current version of PostgreSQL, it's quite possible the answer you'll get is that the problem is resolved in a later one. It may save you some time to try at least the most current update to the version you're using (i.e. upgrade to 8.2.6 if that's the current latest rev and you're using 8.2.3) and see if that improves the plan you get. <br />
<br />
An advanced technique here is to save your explain plans over time and see how they change as the amount of data in the table grows. This can give you an idea if you're collecting statistics aggressively enough. [http://people.planetpostgresql.org/greg/index.php?/archives/106-Putting-EXPLAIN-results-into-a-table.html Putting EXPLAIN results into a table] gives a technique for automating that. A simple pl/pgsql example is at [http://archives.postgresql.org/message-id/9359.1243290777@sss.pgh.pa.us generic options for explain].<br />
<br />
[[Category:FAQ]]<br />
[[Category:Performance]]<br />
[[Category:PGAdmin]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Pg_gethostname&diff=16088Pg gethostname2011-12-28T21:58:49Z<p>Brick: Created page with "{{SnippetInfo|C function to get server hostname |version=9.0+|lang=C|category=Library}} Here is a compilable C function to get the server hostname. Tested with CentOS 5 x86_64,…"</p>
<hr />
<div>{{SnippetInfo|C function to get server hostname |version=9.0+|lang=C|category=Library}}<br />
<br />
Here is a compilable C function to get the server hostname.<br />
<br />
Tested with CentOS 5 x86_64, postgresql 9.0 & 9.1.<br />
<br />
It probably works in earlier versions of PostgreSQL and on other Linux distros, but this hasn't been tested.<br />
<br />
<br />
<br />
<source lang="C"><br />
/*<br />
A PostgreSQL function for getting the hostname.<br />
<br />
File: `pg_config --libdir`/pg_gethostname.c<br />
<br />
To compile... (make sure pg_config is in your PATH)<br />
gcc -I$(pg_config --includedir-server) -fpic -c pg_gethostname.c -L$(pg_config --libdir)<br />
gcc --shared -o pg_gethostname.so pg_gethostname.o<br />
<br />
To create the function in PostgreSQL...<br />
CREATE OR REPLACE FUNCTION pg_gethostname() RETURNS text<br />
AS 'pg_gethostname'<br />
LANGUAGE C IMMUTABLE STRICT;<br />
<br />
*/<br />
#include "postgres.h"<br />
#include <limits.h><br />
#include <unistd.h><br />
#include <string.h><br />
#include "fmgr.h"<br />
<br />
#include "utils/palloc.h"<br />
#include "utils/elog.h"<br />
#include "storage/bufpage.h"<br />
<br />
#ifdef PG_MODULE_MAGIC<br />
PG_MODULE_MAGIC;<br />
#endif<br />
<br />
<br />
PG_FUNCTION_INFO_V1( pg_gethostname );<br />
<br />
Datum pg_gethostname( PG_FUNCTION_ARGS );<br />
<br />
Datum pg_gethostname( PG_FUNCTION_ARGS )<br />
{<br />
text *t;<br />
char server_hostname[HOST_NAME_MAX];<br />
size_t length;<br />
<br />
if ( gethostname( server_hostname, HOST_NAME_MAX ) != 0 )<br />
{<br />
// returning an empty string for the hostname if it fails makes<br />
// sense because if it succeeded, we would have a name<br />
server_hostname[0] = '\0';<br />
}<br />
<br />
length = strnlen( server_hostname, HOST_NAME_MAX );<br />
t = (text *) palloc(VARHDRSZ + length );<br />
SET_VARSIZE( t, VARHDRSZ + length );<br />
memcpy( VARDATA(t), server_hostname, length );<br />
<br />
PG_RETURN_TEXT_P( t );<br />
}<br />
<br />
</source><br />
<br />
Usage:<br />
<br />
postgres# select pg_gethostname();<br />
pg_gethostname <br />
----------------<br />
myserver<br />
<br />
<br />
[[Category:C]]</div>Brickhttps://wiki.postgresql.org/index.php?title=PgBouncer&diff=12900PgBouncer2011-01-05T17:32:24Z<p>Brick: /* Downloads */</p>
<hr />
<div>== Project Overview ==<br />
<br />
PgBouncer is a lightweight connection pooler for PostgreSQL.<br />
<br />
== Project Status ==<br />
<br />
Stable, in production.<br />
<br />
== Features ==<br />
<br />
* Several levels of brutality when rotating connections:<br />
*;Session pooling<br />
*:Most polite method. When a client connects, a server connection will be assigned to it for the whole duration it stays connected. When the client disconnects, the server connection will be put back into the pool. This mode supports all PostgreSQL features.<br />
*;Transaction pooling<br />
*:A server connection is assigned to a client only during a transaction. When PgBouncer notices that the transaction is over, the server connection will be put back into the pool.<br />
*:This mode breaks a few session-based features of PostgreSQL. You can use it only when the application cooperates by not using features that break. See the table below for incompatible features.<br />
*;Statement pooling<br />
*:Most aggressive method. This is transaction pooling with a twist - multi-statement transactions are disallowed. This is meant to enforce "autocommit" mode on the client, mostly targeted at PL/Proxy. <br />
* Low memory requirements (2k per connection by default). This is due to the fact that PgBouncer does not need to see a full packet at once.<br />
* It is not tied to one backend server; the destination databases can reside on different hosts.<br />
* Supports online reconfiguration for most of the settings.<br />
* Supports online restart/upgrade without dropping client connections.<br />
* Supports protocol V3 only, so backend version must be >= 7.4. <br />
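<br />
A minimal pgbouncer.ini sketch illustrating transaction pooling (the database name, addresses, and file path are placeholders; see the config file documentation below for the full list of settings):<br />
 [databases]<br />
 mydb = host=127.0.0.1 port=5432 dbname=mydb<br />
 <br />
 [pgbouncer]<br />
 listen_addr = 127.0.0.1<br />
 listen_port = 6432<br />
 auth_type = trust<br />
 auth_file = /etc/pgbouncer/userlist.txt<br />
 pool_mode = transaction<br />
 ; see note [1] in the feature matrix below<br />
 server_reset_query = DISCARD ALL<br />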
<br />
== Documentation ==<br />
<br />
* [http://pgbouncer.projects.postgresql.org/doc/usage.html Command line usage].<br />
* [http://pgbouncer.projects.postgresql.org/doc/config.html Config file documentation].<br />
* [http://pgbouncer.projects.postgresql.org/doc/faq.html FAQ]: Load-balancing, SSL, prepared statements, online restart.<br />
<br />
== Downloads ==<br />
<br />
* Official releases, CVS: http://pgfoundry.org/projects/pgbouncer<br />
<br />
'''Binary packages (may not be up-to-date)'''<br />
<br />
* Deb: [http://www.debian.org Debian] and [http://www.ubuntu.com Ubuntu] include pgbouncer in their official repositories.<br />
* RPM: http://www.pgrpms.org/ (http://www.pgsqlrpms.org -- deprecated)<br />
<br />
== Community support ==<br />
<br />
* Mailing List: http://lists.pgfoundry.org/mailman/listinfo/pgbouncer-general<br />
* Bugtracker: http://pgfoundry.org/projects/pgbouncer/<br />
<br />
== Feature matrix for pooling modes ==<br />
<br />
The following table lists various PostgreSQL features and whether they are compatible with PgBouncer pooling modes. Note that 'transaction' pooling breaks client expectations of the server '''by design''' and can be used only if the application cooperates by not using non-working features.<br />
<br />
{| cellpadding="5" cellspacing="0" border="1"<br />
|+ Supported PostgreSQL features.<br />
|-<br />
!Feature<br />
!Session pooling [1]<br />
!Transaction pooling<br />
|-<br />
|Startup parameters [2] || Yes || Yes<br />
|-<br />
| SET/RESET || Yes || No<br />
|-<br />
| LISTEN/NOTIFY || Yes || No<br />
|-<br />
| WITHOUT HOLD CURSOR || Yes || Yes<br />
|-<br />
| WITH HOLD CURSOR || Yes || No<br />
|-<br />
| Protocol-level prepared plans || Yes || No [3]<br />
|-<br />
| PREPARE / DEALLOCATE || Yes || No<br />
|-<br />
| ON COMMIT DROP temp tables || Yes || Yes<br />
|-<br />
| PRESERVE/DELETE ROWS temp tables || Yes || No<br />
|-<br />
| Cached plan reset || Yes || Yes [1]<br />
|-<br />
| LOAD statement || Yes || No<br />
|}<br />
<br />
* [1] - Full transparency requires PostgreSQL 8.3 and PgBouncer 1.1+ with server_reset_query = DISCARD ALL<br />
* [2] - Startup parameters are: client_encoding, datestyle, timezone and standard_conforming_strings. PgBouncer can detect their changes so it can guarantee they remain consistent for the client. Available from PgBouncer 1.1.<br />
* [3] - It is possible to add support for that into PgBouncer.<br />
<br />
[[Category:Administration]]<br />
[[Category:Performance]]</div>Brickhttps://wiki.postgresql.org/index.php?title=GeoIP_index&diff=12783GeoIP index2010-12-08T16:49:43Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Spatial index on GeoIP data to speed up fetches|lang=SQL|category=Library|depends=Nothing}}<br />
<br />
<br />
If you are using PostgreSQL + ip2location (or MaxMind), creating the following spatial index on the ip2location table results in a huge performance increase.<br />
<br />
Create a spatial index on the ip_from/ip_to attributes:<br />
<br />
-- Taken from http://www.pgsql.cz/index.php/PostgreSQL_SQL_Tricks#Fast_interval_.28of_time_or_ip_addresses.29_searching_with_spatial_indexes<br />
<br />
<source lang="SQL"><br />
CREATE INDEX ip2location_range_gist ON ip2location USING gist ((box(point(ip_from,ip_from),point(ip_to,ip_to))) box_ops);<br />
<br />
analyze verbose ip2location;<br />
<br />
explain analyze<br />
select *<br />
from ip2location<br />
where box(point(ip_from,ip_from),point(ip_to,ip_to)) @> box(point (3512069689,3512069689), point(3512069689,3512069689));<br />
</source><br />
<br />
Query version where the IP needs to be converted to bigint:<br />
<br />
<source lang="SQL"><br />
explain analyze<br />
select *<br />
from ip2location<br />
where box(point(ip_from,ip_from),point(ip_to,ip_to)) @> box(point (inet_to_bigint('209.85.238.57'),inet_to_bigint('209.85.238.57')), point(inet_to_bigint('209.85.238.57'),inet_to_bigint('209.85.238.57')));<br />
</source><br />
<br />
The inet_to_bigint functions come from that same page:<br />
<br />
<source lang="SQL"><br />
-- from http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Bigint_to_ip<br />
drop function if exists bigint_to_inet(bigint);<br />
create function bigint_to_inet(bigint) returns inet as $$<br />
select (($1>>24&255)||'.'||($1>>16&255)||'.'||($1>>8&255)||'.'||($1>>0&255))::inet<br />
$$ language sql strict immutable;<br />
grant execute on function bigint_to_inet(bigint) to public;<br />
<br />
drop function if exists inet_to_bigint(inet);<br />
CREATE OR REPLACE FUNCTION inet_to_bigint(inet) RETURNS bigint AS $$<br />
SELECT $1 - inet '0.0.0.0'<br />
$$ LANGUAGE SQL strict immutable;<br />
grant execute on function inet_to_bigint(inet) to public;<br />
</source><br />
<br />
''added by bricklen''</div>Brickhttps://wiki.postgresql.org/index.php?title=GeoIP_index&diff=12782GeoIP index2010-12-08T16:47:43Z<p>Brick: New page: {{SnippetInfo|Spatial index on GeoIP data to speed up fetches|lang=SQL|category=Library|depends=Nothing}} If your are using PostgreSQL + ip2location (or MaxMind), creating the following ...</p>
<hr />
<div>{{SnippetInfo|Spatial index on GeoIP data to speed up fetches|lang=SQL|category=Library|depends=Nothing}}<br />
<br />
<br />
If you are using PostgreSQL + ip2location (or MaxMind), creating the following spatial index on the ip2location table results in a huge performance increase.<br />
<br />
Create a spatial index on the ip_from/ip_to attributes:<br />
<br />
-- Taken from http://www.pgsql.cz/index.php/PostgreSQL_SQL_Tricks#Fast_interval_.28of_time_or_ip_addresses.29_searching_with_spatial_indexes<br />
<br />
<source lang="SQL"><br />
CREATE INDEX ip2location_range_gist ON ip2location USING gist ((box(point(ip_from,ip_from),point(ip_to,ip_to))) box_ops);<br />
<br />
analyze verbose ip2location;<br />
<br />
explain analyze<br />
select *<br />
from ip2location<br />
where box(point(ip_from,ip_from),point(ip_to,ip_to)) @> box(point (3512069689,3512069689), point(3512069689,3512069689));<br />
</source><br />
<br />
Query version where the IP needs to be converted to bigint:<br />
<br />
<source lang="SQL"><br />
explain analyze<br />
select *<br />
from ip2location<br />
where box(point(ip_from,ip_from),point(ip_to,ip_to)) @> box(point (inet_to_bigint('209.85.238.57'),inet_to_bigint('209.85.238.57')), point(inet_to_bigint('209.85.238.57'),inet_to_bigint('209.85.238.57')));<br />
</source><br />
<br />
The inet_to_bigint functions come from that same page:<br />
<br />
<source lang="SQL"><br />
-- from http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Bigint_to_ip<br />
drop function if exists bigint_to_inet(bigint);<br />
create function bigint_to_inet(bigint) returns inet as $$<br />
select (($1>>24&255)||'.'||($1>>16&255)||'.'||($1>>8&255)||'.'||($1>>0&255))::inet<br />
$$ language sql strict immutable;<br />
grant execute on function bigint_to_inet(bigint) to public;<br />
<br />
drop function if exists inet_to_bigint(inet);<br />
CREATE OR REPLACE FUNCTION inet_to_bigint(inet) RETURNS bigint AS $$<br />
SELECT $1 - inet '0.0.0.0'<br />
$$ LANGUAGE SQL strict immutable;<br />
grant execute on function inet_to_bigint(inet) to public;<br />
</source></div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=12678Search public functions2010-12-02T22:28:26Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Search user-created functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
<br />
Search all functions in your PostgreSQL db for any matching terms.<br />
<br />
The input will be used as the regular expression in the regexp_matches() function.<br />
<br />
Eg. <br />
''to find 7 - 10 consecutive digits in any function:''<br />
<br />
select function_name,matching_terms from search_public_functions('[0-9]{7,10}',true);<br />
<br />
''string example at the bottom of page''<br />
<br />
<source lang="plsql"><br />
-- load into db as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
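<br />
For a quick ad-hoc search without installing the function, a similar result can be had with a plain catalog query. Below is a hedged sketch using the same catalogs the function above uses (on PostgreSQL 11 and later, replace NOT p.proisagg with p.prokind <> 'a'):<br />
<br />
<source lang="plsql"><br />
-- Case-insensitive regexp search over user-visible function definitions.<br />
SELECT n.nspname||'.'||p.proname AS function_name<br />
FROM pg_catalog.pg_proc p<br />
JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
  AND n.nspname NOT IN ('pg_catalog','information_schema')<br />
  AND NOT p.proisagg<br />
  AND pg_catalog.pg_get_functiondef(p.oid) ~* 'crosstab|intersect|except|ctid'<br />
ORDER BY 1;<br />
</source><br />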
<br />
<br />
''original code (and any errors) by bricklen'' <br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Distance_in_km&diff=12677Distance in km2010-12-02T22:25:56Z<p>Brick: New page: {{SnippetInfo|Distance between 2 points|lang=PL/Perlu|category=Library}} To get the distance between two points (lat1, long1, lat2, long2), and accounting for the curvature of the Earth. ...</p>
<hr />
<div>{{SnippetInfo|Distance between 2 points|lang=PL/Perlu|category=Library}}<br />
<br />
Get the distance between two points (lat1, long1, lat2, long2), accounting for the curvature of the Earth. Note that the plperlu function is not exact, but it should be pretty close.<br />
<br />
''This example uses the contrib modules "earthdistance" and "cube"; the <@> operator returns statute miles, hence the multiplication by 1.609344 to get kilometers. It yields virtually the same results as the Math::Trig function below.''<br />
select ROUND( ('(33.0, 97.1)'::point <@> '(24.0, 88.6)')::NUMERIC * 1.609344 ) as km;<br />
<br />
''If you are unable to, or do not want to, install the earthdistance and cube contrib modules, you can write a plperlu function to get similar results''<br />
<br />
<source lang="perl"><br />
create or replace function distance_in_km (numeric,numeric,numeric,numeric) returns numeric as $body$<br />
use strict;<br />
use Math::Trig qw(great_circle_distance deg2rad);<br />
# from http://perldoc.perl.org/Math/Trig.html<br />
my $lat1 = shift(@_);<br />
my $lon1 = shift(@_);<br />
my $lat2 = shift(@_);<br />
my $lon2 = shift(@_);<br />
<br />
#elog WARNING, "$lat1, $lon1, $lat2, $lon2\n";<br />
<br />
# Notice the 90 - latitude: phi zero is at the North Pole.<br />
sub NESW { deg2rad($_[0]), deg2rad(90 - $_[1]) }<br />
my @L = NESW( $lat1, $lon1 );<br />
my @T = NESW( $lat2, $lon2 );<br />
my $km = great_circle_distance(@L, @T, 6378);<br />
<br />
return $km;<br />
$body$ language plperlu strict security definer;<br />
</source><br />
<br />
select ROUND(km) as km from distance_in_km(33.0,97.1,24.0,88.6) as km;<br />
km <br />
-----<br />
945<br />
<br />
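In the same spirit, here is a hedged plain-SQL sketch of the haversine formula, needing neither contrib modules nor plperlu. The function name distance_in_km_sql is made up for illustration; arguments are taken strictly as (lat1, lon1, lat2, lon2) in degrees, and a mean Earth radius of 6371 km is assumed, so results may differ slightly from the examples above.<br />
<br />
<source lang="SQL"><br />
-- Haversine distance in km; arguments are (lat1, lon1, lat2, lon2) in degrees.<br />
create or replace function distance_in_km_sql(numeric, numeric, numeric, numeric)<br />
returns numeric as $$<br />
select (2 * 6371 * asin(sqrt(<br />
           pow(sin(radians($3 - $1) / 2), 2)<br />
         + cos(radians($1)) * cos(radians($3)) * pow(sin(radians($4 - $2) / 2), 2)<br />
       )))::numeric<br />
$$ language sql strict immutable;<br />
<br />
-- select round(distance_in_km_sql(33.0, 97.1, 24.0, 88.6)) as km;<br />
</source><br />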
<br />
''Written by bricklen, feel free to correct or enhance!''<br />
<br />
[[Category:PL/Perl]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=12596Search public functions2010-11-29T00:28:12Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Search user-created functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
<br />
Search all functions in your PostgreSQL db for any matching terms. The input will be used as the regular expression in the regexp_matches() function.<br />
E.g.<br />
''to find 7 - 10 consecutive digits in any function:''<br />
select function_name,matching_terms from search_public_functions('[0-9]{7,10}',true);<br />
<br />
''string example at the bottom of page''<br />
<br />
<source lang="plsql"><br />
-- load into db as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
<br />
<br />
''original code (and any errors) by bricklen'' <br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Audit_trigger&diff=12595Audit trigger2010-11-29T00:14:06Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Generic audit trigger function |version=8.4+|lang=PL/pgSQL|category=Library}}<br />
<br />
Here is an example of a generic trigger function used for auditing changes to tables, and optionally updating a "row_last_updated" column with the UTC timestamp of the change. <br />
<br />
The "last updated" timestamp uses "now()", as it is current at the *start* of the transaction, so all actions that occur with each call of the "audit.if_modified_func" will have the same timestamp.<br />
<br />
<source lang="plsql"><br />
-- create a schema named "audit"<br />
create schema audit;<br />
<br />
-- UTC is used as the audit time because if the calling applications are in other time zones, it simplifies things if the source data is in a common time zone.<br />
<br />
create table audit.logged_actions (<br />
schema_name text not null,<br />
table_name text not null,<br />
user_name text,<br />
action_tstamp_utc timestamp not null default (now() at time zone 'UTC'),<br />
action TEXT NOT NULL check (action in ('I','D','U')),<br />
original_data text,<br />
new_data text,<br />
query text<br />
) with (fillfactor=100);<br />
create index logged_actions_schema_table_idx on audit.logged_actions(((schema_name||'.'||table_name)::TEXT));<br />
create index logged_actions_action_tstamp_utc_idx on audit.logged_actions(action_tstamp_utc);<br />
create index logged_actions_action_idx on audit.logged_actions(action);<br />
<br />
<br />
-- generic function for all tables<br />
--drop function if exists audit.if_modified_func() cascade;<br />
CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS trigger AS $body$<br />
DECLARE<br />
v_old_data TEXT;<br />
v_new_data TEXT;<br />
BEGIN<br />
/* If this is actually for real auditing (where you need to log EVERY action),<br />
then you would need to use something like dblink or plperl that could log outside the transaction,<br />
regardless of whether the transaction committed or rolled back.<br />
*/<br />
<br />
/* This dance with casting the NEW and OLD values to a ROW is not necessary in pg 9.0+ */<br />
<br />
if (TG_OP = 'UPDATE') then<br />
v_old_data := ROW(OLD.*);<br />
v_new_data := ROW(NEW.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,original_data,new_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_old_data,v_new_data, current_query());<br />
-- Alternative version, just log the query, not the contents of the columns<br />
--insert into audit.logged_actions (schema_name,table_name,user_name,action,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),current_query());<br />
<br />
-- to update the row_last_updated column if implemented<br />
NEW.row_last_updated = timezone('UTC'::text, now());<br />
RETURN NEW;<br />
elsif (TG_OP = 'DELETE') then<br />
v_old_data := ROW(OLD.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,original_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_old_data, current_query());<br />
RETURN NULL;<br />
elsif (TG_OP = 'INSERT') then<br />
v_new_data := ROW(NEW.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,new_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_new_data, current_query());<br />
-- no need to set the "row_last_updated" timestamp, as that is handled using the "DEFAULT" option on in the table DDL<br />
RETURN NEW;<br />
else<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - Other action occurred: %, at %',TG_OP,now();<br />
RETURN NULL;<br />
end if;<br />
<br />
EXCEPTION<br />
WHEN data_exception THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [DATA EXCEPTION] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
WHEN unique_violation THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [UNIQUE] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
WHEN others THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [OTHER] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
END;<br />
$body$ language plpgsql security definer;<br />
</source><br />
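<br />
When querying the audit log for a particular table, use the same expression the first index was built on, otherwise the planner cannot use it. For example:<br />
<br />
<source lang="plsql"><br />
-- This predicate matches the expression index logged_actions_schema_table_idx.<br />
select *<br />
from audit.logged_actions<br />
where (schema_name||'.'||table_name)::TEXT = 'public.t'<br />
order by action_tstamp_utc desc;<br />
</source><br />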
<br />
<br />
<source lang="plsql"><br />
-- Tested with a table named "t"<br />
drop table if exists t;<br />
create table t (x int not null primary key, y text);<br />
alter table t add row_last_updated timestamp not null default (now() at time zone 'UTC');<br />
<br />
-- needs to be applied to all tables that we want to monitor<br />
<br />
-- this is a test trigger to show how we can audit all changes to the relevant tables, including inserts<br />
create trigger t_if_modified_trg before insert or update or delete on t for each row execute procedure audit.if_modified_func();<br />
<br />
-- Some sample updates, deletes, and inserts to illustrate the points<br />
select * from t; select * from audit.logged_actions;<br />
<br />
insert into t (x,y) values (1,'asdf'),(2,'werwer'),(3,null);<br />
select * from t; select * from audit.logged_actions;<br />
<br />
update t set y='eeeeee' where x=2;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
update t set y='yuyuyuy' where x=3;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
delete from t where x=1;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
-- should be a pk violation<br />
update t set x=4 where x=2;<br />
select * from t; select * from audit.logged_actions;<br />
</source><br />
<br />
<br />
''Original code by bricklen. Any errors are mine''<br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=12594Search public functions2010-11-29T00:09:14Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Search user-created functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
<br />
Search all functions in your PostgreSQL db for any matching terms. The input will be used as the regular expression in the regexp_matches() function.<br />
E.g.<br />
''to find 7 - 10 consecutive digits in any function:''<br />
select function_name,matching_terms from search_public_functions('[0-9]{7,10}',true);<br />
<br />
''string example at the bottom of page''<br />
<br />
<source lang="plsql"><br />
-- load into db as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
<br />
<br />
''original code by bricklen'' <br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Audit_trigger&diff=12593Audit trigger2010-11-29T00:08:43Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Generic audit trigger function |version=8.4+|lang=PL/pgSQL|category=Library}}<br />
<br />
Here is an example of a generic trigger function used for auditing changes to tables, and optionally updating a "row_last_updated" column with the UTC timestamp of the change. <br />
<br />
The "last updated" timestamp uses "now()", as it is current at the *start* of the transaction, so all actions that occur with each call of the "audit.if_modified_func" will have the same timestamp.<br />
<br />
<source lang="plsql"><br />
-- create a schema named "audit"<br />
create schema audit;<br />
<br />
-- UTC is used as the audit time because if the calling applications are in other time zones, it simplifies things if the source data is in a common time zone.<br />
<br />
create table audit.logged_actions (<br />
schema_name text not null,<br />
table_name text not null,<br />
user_name text,<br />
action_tstamp_utc timestamp not null default (now() at time zone 'UTC'),<br />
action TEXT NOT NULL check (action in ('I','D','U')),<br />
original_data text,<br />
new_data text,<br />
query text<br />
) with (fillfactor=100);<br />
create index logged_actions_schema_table_idx on audit.logged_actions(((schema_name||'.'||table_name)::TEXT));<br />
create index logged_actions_action_tstamp_utc_idx on audit.logged_actions(action_tstamp_utc);<br />
create index logged_actions_action_idx on audit.logged_actions(action);<br />
<br />
<br />
-- generic function for all tables<br />
--drop function if exists audit.if_modified_func() cascade;<br />
CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS trigger AS $body$<br />
DECLARE<br />
v_old_data TEXT;<br />
v_new_data TEXT;<br />
BEGIN<br />
/* If this is actually for real auditing (where you need to log EVERY action),<br />
then you would need to use something like dblink or plperl that could log outside the transaction,<br />
regardless of whether the transaction committed or rolled back.<br />
*/<br />
<br />
/* This dance with casting the NEW and OLD values to a ROW is not necessary in pg 9.0+ */<br />
<br />
if (TG_OP = 'UPDATE') then<br />
v_old_data := ROW(OLD.*);<br />
v_new_data := ROW(NEW.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,original_data,new_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_old_data,v_new_data, current_query());<br />
-- Alternative version, just log the query, not the contents of the columns<br />
--insert into audit.logged_actions (schema_name,table_name,user_name,action,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),current_query());<br />
<br />
-- to update the row_last_updated column if implemented<br />
NEW.row_last_updated = timezone('UTC'::text, now());<br />
RETURN NEW;<br />
elsif (TG_OP = 'DELETE') then<br />
v_old_data := ROW(OLD.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,original_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_old_data, current_query());<br />
RETURN NULL;<br />
elsif (TG_OP = 'INSERT') then<br />
v_new_data := ROW(NEW.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,new_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_new_data, current_query());<br />
RETURN NEW;<br />
else<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - Other action occurred: %, at %',TG_OP,now();<br />
RETURN NULL;<br />
end if;<br />
<br />
EXCEPTION<br />
WHEN data_exception THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [DATA EXCEPTION] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
WHEN unique_violation THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [UNIQUE] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
WHEN others THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [OTHER] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
END;<br />
$body$ language plpgsql security definer;<br />
</source><br />
<br />
<br />
<source lang="plsql"><br />
-- Tested with a table named "t"<br />
drop table if exists t;<br />
create table t (x int not null primary key, y text);<br />
alter table t add row_last_updated timestamp not null default (now() at time zone 'UTC');<br />
<br />
-- needs to be applied to all tables that we want to monitor<br />
<br />
-- this is a test trigger to show how we can audit all changes to the relevant tables, including inserts<br />
create trigger t_if_modified_trg before insert or update or delete on t for each row execute procedure audit.if_modified_func();<br />
<br />
-- Some sample updates, deletes, and inserts to illustrate the points<br />
select * from t; select * from audit.logged_actions;<br />
<br />
insert into t (x,y) values (1,'asdf'),(2,'werwer'),(3,null);<br />
select * from t; select * from audit.logged_actions;<br />
<br />
update t set y='eeeeee' where x=2;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
update t set y='yuyuyuy' where x=3;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
delete from t where x=1;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
-- should be a pk violation<br />
update t set x=4 where x=2;<br />
select * from t; select * from audit.logged_actions;<br />
</source><br />
<br />
<br />
Written by bricklen. All errors are mine :)<br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Audit_trigger&diff=12592Audit trigger2010-11-28T23:58:53Z<p>Brick: New page: {{SnippetInfo|Generic audit trigger function |version=8.4+|lang=PL/pgSQL|category=Library}} Here is an example of a generic trigger function used for auditing changes to tables, and optio...</p>
<hr />
<div>{{SnippetInfo|Generic audit trigger function |version=8.4+|lang=PL/pgSQL|category=Library}}<br />
<br />
Here is an example of a generic trigger function used for auditing changes to tables, and optionally updating a "row_last_updated" column with the UTC timestamp of the change. <br />
<br />
The "last updated" timestamp uses "now()", as it is current at the *start* of the transaction, so all actions that occur with each call of the "audit.if_modified_func" will have the same timestamp.<br />
<br />
<source lang="plsql"><br />
-- create a schema named "audit"<br />
create schema audit;<br />
<br />
-- UTC is used as the audit time because if the calling applications are in other time zones, it simplifies things if the source data is in a common time zone.<br />
<br />
create table audit.logged_actions (<br />
schema_name text not null,<br />
table_name text not null,<br />
user_name text,<br />
action_tstamp_utc timestamp not null default (now() at time zone 'UTC'),<br />
action TEXT NOT NULL check (action in ('I','D','U')),<br />
original_data text,<br />
new_data text,<br />
query text<br />
) with (fillfactor=100);<br />
create index logged_actions_schema_table_idx on audit.logged_actions(((schema_name||'.'||table_name)::TEXT));<br />
create index logged_actions_action_tstamp_utc_idx on audit.logged_actions(action_tstamp_utc);<br />
create index logged_actions_action_idx on audit.logged_actions(action);<br />
<br />
<br />
-- generic function for all tables<br />
--drop function if exists audit.if_modified_func() cascade;<br />
CREATE OR REPLACE FUNCTION audit.if_modified_func() RETURNS trigger AS $body$<br />
DECLARE<br />
v_old_data TEXT;<br />
v_new_data TEXT;<br />
BEGIN<br />
/* If this is actually for real auditing (where you need to log EVERY action),<br />
then you would need to use something like dblink or plperl that could log outside the transaction,<br />
regardless of whether the transaction committed or rolled back.<br />
*/<br />
<br />
/* This dance with casting the NEW and OLD values to a ROW is not necessary in pg 9.0+ */<br />
<br />
if (TG_OP = 'UPDATE') then<br />
v_old_data := ROW(OLD.*);<br />
v_new_data := ROW(NEW.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,original_data,new_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_old_data,v_new_data, current_query());<br />
-- Alternative version, just log the query, not the contents of the columns<br />
--insert into audit.logged_actions (schema_name,table_name,user_name,action,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),current_query());<br />
<br />
-- to update the row_last_updated column if implemented<br />
NEW.row_last_updated = timezone('UTC'::text, now());<br />
RETURN NEW;<br />
elsif (TG_OP = 'DELETE') then<br />
v_old_data := ROW(OLD.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,original_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_old_data, current_query());<br />
RETURN NULL;<br />
elsif (TG_OP = 'INSERT') then<br />
v_new_data := ROW(NEW.*);<br />
insert into audit.logged_actions (schema_name,table_name,user_name,action,new_data,query) values (TG_TABLE_SCHEMA::TEXT,TG_TABLE_NAME::TEXT,session_user::TEXT,substring(TG_OP,1,1),v_new_data, current_query());<br />
RETURN NEW;<br />
else<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - Other action occurred: %, at %',TG_OP,now();<br />
RETURN NULL;<br />
end if;<br />
<br />
EXCEPTION<br />
WHEN data_exception THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [DATA EXCEPTION] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
WHEN unique_violation THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [UNIQUE] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
WHEN others THEN<br />
RAISE WARNING '[AUDIT.IF_MODIFIED_FUNC] - UDF ERROR [OTHER] - SQLSTATE: %, SQLERRM: %',SQLSTATE,SQLERRM;<br />
RETURN NULL;<br />
END;<br />
$body$ language plpgsql security definer;<br />
</source><br />
<br />
<br />
<source lang="plsql"><br />
-- Tested with a table named "t"<br />
drop table if exists t;<br />
create table t (x int not null primary key, y text);<br />
alter table t add row_last_updated timestamp not null default (now() at time zone 'UTC');<br />
<br />
-- needs to be applied to all tables that we want to monitor<br />
<br />
-- this is a test trigger to show how we can audit all changes to the relevant tables, including inserts<br />
create trigger t_if_modified_trg before insert or update or delete on t for each row execute procedure audit.if_modified_func();<br />
<br />
-- Some sample updates, deletes, and inserts to illustrate the points<br />
select * from t; select * from audit.logged_actions;<br />
<br />
insert into t (x,y) values (1,'asdf'),(2,'werwer'),(3,null);<br />
select * from t; select * from audit.logged_actions;<br />
<br />
update t set y='eeeeee' where x=2;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
update t set y='yuyuyuy' where x=3;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
delete from t where x=1;<br />
select * from t; select * from audit.logged_actions;<br />
<br />
-- should be a pk violation<br />
update t set x=4 where x=2;<br />
select * from t; select * from audit.logged_actions;<br />
</source><br />
<br />
<br />
Written by brick. All errors are mine :)<br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Round_time&diff=11983Round time2010-09-20T14:40:35Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Rounds a timestamp to the nearest 5 minute mark|version=8.2|lang=PL/pgSQL|category=Library}}<br />
<br />
<br />
Round a timestamp to the nearest 5 minute mark.<br />
<br />
<source lang="plsql"><br />
CREATE OR REPLACE FUNCTION round_time(timestamp with time zone) <br />
RETURNS timestamp with time zone AS $$ <br />
SELECT date_trunc('hour', $1) + interval '5 min' * round(date_part('minute', $1) / 5.0) <br />
$$ LANGUAGE sql;<br />
</source> <br />
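<br />
The same idea generalizes to any granularity using epoch arithmetic. A hedged sketch (the name round_time_any is made up for illustration):<br />
<br />
<source lang="plsql"><br />
-- Round a timestamptz to the nearest multiple of an arbitrary interval, e.g. '15 min'.<br />
CREATE OR REPLACE FUNCTION round_time_any(timestamp with time zone, interval)<br />
RETURNS timestamp with time zone AS $$<br />
SELECT to_timestamp(round(extract(epoch from $1) / extract(epoch from $2))<br />
                    * extract(epoch from $2))<br />
$$ LANGUAGE sql;<br />
<br />
-- SELECT round_time_any(now(), '15 min');<br />
</source><br />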
<br />
Sample usage of the 5-minute version:<br />
<pre><br />
postgres=# select now(), round_time('2010-09-17 16:48');<br />
now | round_time <br />
-------------------------------+------------------------<br />
2010-09-19 08:36:31.701919+02 | 2010-09-17 16:50:00+02<br />
(1 row)<br />
<br />
postgres=# select now(), round_time('2010-09-17 16:58');<br />
now | round_time <br />
-------------------------------+------------------------<br />
2010-09-19 08:36:43.860858+02 | 2010-09-17 17:00:00+02<br />
(1 row)<br />
<br />
postgres=# select now(), round_time('2010-09-17 16:57');<br />
now | round_time <br />
-------------------------------+------------------------<br />
2010-09-19 08:36:53.273612+02 | 2010-09-17 16:55:00+02<br />
(1 row)<br />
<br />
postgres=# select now(), round_time('2010-09-17 23:58');<br />
now | round_time <br />
------------------------------+------------------------<br />
2010-09-19 08:37:09.41387+02 | 2010-09-18 00:00:00+02<br />
(1 row)<br />
</pre></div>Brickhttps://wiki.postgresql.org/index.php?title=Round_time&diff=11970Round time2010-09-17T23:53:53Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Rounds a timestamp to the nearest 5 minute mark|version=8.2|lang=PL/pgSQL|category=Library}}<br />
<br />
''original code by brick''<br />
<br />
Round a timestamp to the nearest 5 minute mark.<br />
If folks have better/more efficient ways of doing this, please update this page.<br />
(can make a function to handle timestamptz by substituting timestamptz for timestamp)<br />
<br />
<source lang="plsql"><br />
create or replace function round_time (p_datetime TIMESTAMP, OUT rounded_time TIMESTAMP) RETURNS TIMESTAMP AS<br />
$body$<br />
declare<br />
v_min TEXT;<br />
v_datetime TIMESTAMP(0);<br />
v_last_number INTEGER;<br />
begin<br />
-- makes an assumption that timestamp is in the following format yyyy-mm-dd hh24:mi<br />
v_datetime := to_char(p_datetime,'YYYY-MM-DD HH24:MI');<br />
v_min := to_char(p_datetime,'MI');<br />
v_last_number := substring(v_min,2,1)::INTEGER;<br />
<br />
if ( v_last_number >= 5 ) then<br />
rounded_time := (v_datetime + interval '5 minutes' - (v_last_number * interval '1 minute'))::TIMESTAMP(0);<br />
else<br />
rounded_time := (v_datetime - (v_last_number * interval '1 minute'))::TIMESTAMP(0);<br />
end if;<br />
<br />
end;<br />
$body$ language plpgsql strict immutable;<br />
</source><br />
<br />
Sample usage:<br />
<br />
select now(),round_time(now()::TIMESTAMP);<br />
<br />
now | round_time <br />
-------------------------------+---------------------<br />
2010-09-17 16:32:51.261802-07 | 2010-09-17 16:30:00</div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=11969Search public functions2010-09-17T23:39:33Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Search user-created functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
''original code by brick''<br />
<br />
Search all functions in your PostgreSQL db for any matching terms. The input will be used as the regular expression in the regexp_matches() function.<br />
E.g.<br />
''to find 7 - 10 consecutive digits in any function:''<br />
select function_name,matching_terms from search_public_functions('[0-9]{7,10}',true);<br />
<br />
''string example at the bottom of page''<br />
<br />
<source lang="plsql"><br />
-- load into db as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
<br />
<br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Round_time&diff=11968Round time2010-09-17T23:38:37Z<p>Brick: New page: {{SnippetInfo|Rounds a timestamp to the nearest 5 minute mark|version=8.2|lang=PL/pgSQL|category=Library}} ''original code by brick'' Round a timestamp to the nearest 5 minute mark. If f...</p>
<hr />
<div>{{SnippetInfo|Rounds a timestamp to the nearest 5 minute mark|version=8.2|lang=PL/pgSQL|category=Library}}<br />
<br />
''original code by brick''<br />
<br />
Round a timestamp to the nearest 5 minute mark.<br />
If folks have better/more efficient ways of doing this, please update this page.<br />
<br />
<source lang="plsql"><br />
create or replace function round_time (p_datetime TIMESTAMP, OUT rounded_time TIMESTAMP) RETURNS TIMESTAMP AS<br />
$body$<br />
declare<br />
v_min TEXT;<br />
v_datetime TIMESTAMP(0);<br />
v_last_number INTEGER;<br />
begin<br />
-- makes an assumption that timestamp is in the following format yyyy-mm-dd hh24:mi<br />
v_datetime := to_char(p_datetime,'YYYY-MM-DD HH24:MI');<br />
v_min := to_char(p_datetime,'MI');<br />
v_last_number := substring(v_min,2,1)::INTEGER;<br />
<br />
if ( v_last_number >= 5 ) then<br />
rounded_time := (v_datetime + interval '5 minutes' - (v_last_number * interval '1 minute'))::TIMESTAMP(0);<br />
else<br />
rounded_time := (v_datetime - (v_last_number * interval '1 minute'))::TIMESTAMP(0);<br />
end if;<br />
<br />
end;<br />
$body$ language plpgsql strict immutable;<br />
</source><br />
<br />
Sample usage:<br />
<br />
select now(),round_time(now()::TIMESTAMP);<br />
<br />
now | round_time <br />
-------------------------------+---------------------<br />
2010-09-17 16:32:51.261802-07 | 2010-09-17 16:30:00</div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=11966Search public functions2010-09-17T21:48:42Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Search user-created functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
''original code by bricklen''<br />
<br />
Search all functions in your PostgreSQL db for any matching terms. The input will be used as the regular expression in the regexp_matches() function.<br />
E.g.<br />
''to find 7 - 10 consecutive digits in any function:''<br />
select function_name,matching_terms from search_public_functions('[0-9]{7,10}',true);<br />
<br />
''string example at the bottom of page''<br />
<br />
<source lang="plsql"><br />
-- load into db as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
<br />
<br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=11965Search public functions2010-09-17T21:40:52Z<p>Brick: </p>
<hr />
<div>{{SnippetInfo|Search user-created functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
''original code by bricklen''<br />
<br />
<source lang="plsql"><br />
-- load into db as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
<br />
<br />
<br />
[[Category:PL/pgSQL]]</div>Brickhttps://wiki.postgresql.org/index.php?title=Search_public_functions&diff=11962Search public functions2010-09-17T18:18:45Z<p>Brick: New page: {{SnippetInfo|search_public_functions|version=8.4|lang=PL/pgSQL|category=Library}} <source lang="plsql"> -- load as superuser create or replace function search_public_functions(p_search_s...</p>
<hr />
<div>{{SnippetInfo|search_public_functions|version=8.4|lang=PL/pgSQL|category=Library}}<br />
<br />
<source lang="plsql"><br />
-- load as superuser<br />
create or replace function search_public_functions(p_search_strings TEXT, p_case_insensitive BOOLEAN, OUT function_name TEXT, OUT matching_terms TEXT) RETURNS SETOF RECORD AS<br />
$body$<br />
declare<br />
x RECORD;<br />
qry TEXT;<br />
v_match BOOLEAN := 'false';<br />
v_matches TEXT;<br />
v_search_strings TEXT := p_search_strings;<br />
v_case_insensitive BOOLEAN := p_case_insensitive;<br />
v_funcdef TEXT;<br />
begin<br />
/* v_search_strings is a list, pipe-separated, exactly what we want to search against.<br />
NOTE: works on postgresql v8.4<br />
example:<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
*/<br />
<br />
if (v_case_insensitive IS NOT FALSE) then<br />
v_case_insensitive := TRUE;<br />
end if;<br />
<br />
qry := 'SELECT n.nspname||''.''||p.proname||'' (''||pg_catalog.pg_get_function_arguments(p.oid)||'')''::TEXT as funcname,<br />
(select pg_catalog.pg_get_functiondef(p.oid)) as funcdef,<br />
p.oid as funcoid<br />
FROM pg_catalog.pg_proc p<br />
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace<br />
WHERE pg_catalog.pg_function_is_visible(p.oid)<br />
AND n.nspname <> ''pg_catalog''<br />
AND n.nspname <> ''information_schema''<br />
AND NOT p.proisagg<br />
ORDER BY 1';<br />
<br />
if (p_case_insensitive IS TRUE) then<br />
v_search_strings := LOWER(v_search_strings);<br />
end if;<br />
<br />
for x in execute qry loop<br />
v_match := 'false';<br />
function_name := null;<br />
v_funcdef := null;<br />
<br />
select into v_match x.funcdef ~* v_search_strings;<br />
<br />
if ( v_match IS TRUE ) then<br />
v_matches := null;<br />
v_funcdef := x.funcdef;<br />
if (p_case_insensitive IS TRUE) then<br />
v_funcdef := LOWER(v_funcdef);<br />
end if;<br />
select array_to_string(array_agg(val),',') into v_matches from (select distinct array_to_string(regexp_matches(v_funcdef, v_search_strings ,'g'),',') as val) as y2;<br />
<br />
function_name := x.funcname;<br />
matching_terms := v_matches;<br />
RETURN NEXT;<br />
end if;<br />
end loop;<br />
end;<br />
$body$ language plpgsql SECURITY DEFINER;<br />
</source><br />
<br />
It can be called like so:<br />
<br />
select function_name,matching_terms from search_public_functions('crosstab|intersect|except|ctid',true);<br />
<br />
<br />
function_name | matching_terms <br />
-------------------------------------------------------+----------------<br />
public.array_intersect (anyarray, anyarray) | intersect<br />
public.cant_delete_error () | except<br />
public.crosstab2 (text) | crosstab<br />
public.crosstab3 (text) | crosstab<br />
public.crosstab4 (text) | crosstab<br />
public.crosstab (text) | crosstab<br />
public.crosstab (text, integer) | crosstab<br />
public.crosstab (text, text) | crosstab<br />
public.find_bad_block (p_tablename text) | ctid,except<br />
<br />
<br />
<br />
[[Category:PL/pgSQL]]</div>Brick