https://wiki.postgresql.org/api.php?action=feedcontributions&user=Alvherre&feedformat=atomPostgreSQL wiki - User contributions [en]2024-03-29T01:07:15ZUser contributionsMediaWiki 1.35.13https://wiki.postgresql.org/index.php?title=TABLESAMPLE_Implementation&diff=38695TABLESAMPLE Implementation2024-02-29T11:46:31Z<p>Alvherre: fix markup</p>
<hr />
<div>== Design page ==<br />
<br />
This wiki page is a design discussion for a feature that did not exist in PostgreSQL at the time of writing. The feature has been available since version 9.5, and its details can be found in the [https://www.postgresql.org/docs/current/static/tablesample-method.html official documentation].<br />
<br />
== Introduction ==<br />
TABLESAMPLE is an SQL clause defined in the SQL:2003 standard. For example:<br/><br />
<code><br />
SELECT avg(salary)<br/><br />
FROM emp TABLESAMPLE SYSTEM (50);<br/><br />
</code><br />
It returns a sample of the underlying table; the size of the sample depends on the percentage given in parentheses. The relevant part of the SQL:2003 specification is described [[#Project Details|below]].<br/><br />
Microsoft SQL Server and DB2 have implemented this clause. Querying a sample of a table is a common need in practice. The use of sampling is elaborated in [http://www.almaden.ibm.com/cs/people/peterh/idugjbig.pdf a paper from IBM]; on pages 1 and 2, the author describes how a fast sampling method helps in discovering general trends and patterns in data. <br/><br />
It would be useful for PostgreSQL to implement this feature and make it available to users.<br/><br />
This work is being done as a Google Summer of Code 2012 project: [http://google-melange.appspot.com/gsoc/project/google/gsoc2012/hqinnus/13001 Implementing TableSample for Postgres].<br/><br />
<br />
== Project Details ==<br />
===About TABLESAMPLE Clause=== <br />
<br />
====Concepts====<br />
In a <code>&lt;table reference&gt;</code>, <code>&lt;sample clause&gt;</code> can be specified to return a subset of result rows depending on<br />
the <code>&lt;sample method&gt;</code> and <code>&lt;sample percentage&gt;</code>. If the <code>&lt;sample clause&gt;</code> contains <code>&lt;repeatable clause&gt;</code>,<br />
then repeated executions of that <code>&lt;table reference&gt;</code> return a result table with identical rows for a given<br />
<code>&lt;repeat argument&gt;</code>, provided certain implementation-defined conditions are satisfied.<br />
<br />
====Syntax====<br />
<syntaxhighlight lang="bnf"><br />
<table reference> ::= <table factor> | <joined table><br />
<table factor> ::= <table primary> [ <sample clause> ]<br />
<table primary> ::= <table or query name> [ [ AS ] <correlation name> ]<br />
<sample clause> ::= TABLESAMPLE <sample method> <left paren> <br />
<sample percentage> <right paren> [ <repeatable clause> ]<br />
<sample method> ::= BERNOULLI | SYSTEM<br />
<repeatable clause> ::= REPEATABLE <left paren> <repeat argument> <right paren><br />
<sample percentage> ::= <numeric value expression><br />
<repeat argument> ::= <numeric value expression><br />
</syntaxhighlight><br />
<br />
====General Rules====<br />
Let TP be the <code>&lt;table primary&gt;</code> immediately contained in a <code>&lt;table factor&gt;</code> TF. Let RT be the result of<br />
TP. Case:<br />
# If <code>&lt;sample clause&gt;</code> is specified, then:<br />
#:(a) Let N be the number of rows in RT and let S be the value of <code>&lt;sample percentage&gt;</code>.<br />
#:(b) If S is the null value or if S < 0 (zero) or if S > 100, then an exception condition is raised: “data exception — invalid sample size”.<br />
#:(c) If <code>&lt;repeatable clause&gt;</code> is specified, then let RPT be the value of <code>&lt;repeat argument&gt;</code>. If RPT is the null value, then an exception condition is raised: “data exception — invalid repeat argument in a sample clause”.<br />
#:(d) Case:<br />
#::# If <code>&lt;sample method&gt;</code> specifies BERNOULLI, then the result of TF is a table containing approximately (N &#8727; S/100) rows of RT. The probability of a row of RT being included in result of TF is S/100. Further, whether a given row of RT is included in result of TF is independent of whether other rows of RT are included in result of TF.<br />
#::# Otherwise, result of TF is a table containing approximately (N &#8727; S/100) rows of RT. The probability of a row of RT being included in result of TF is S/100.<br />
#:(e) If TF contains outer references, then a table with identical rows is generated every time TF is evaluated with a given set of values for outer references.<br />
# Otherwise, result of TF is RT.<br />
<br />
- <code>&lt;sample method&gt;</code> takes one of two values: BERNOULLI or SYSTEM.<br/><br />
- BERNOULLI picks individual tuples with the specified probability.<br/><br />
- SYSTEM picks whole pages with the specified probability.<br/><br />
<br />
=== Working With Tablesample===<br />
TABLESAMPLE is a clause for sampling a table in a query. The query "select * from foo TABLESAMPLE SYSTEM (1)" is similar to "select * from foo where random()<0.01", in that both return roughly 1% of the rows. When you use TABLESAMPLE, you must specify the sampling method. Currently there are two methods, SYSTEM and BERNOULLI, which are the ones required by ANSI SQL. Other sampling methods might be added later if users find them useful. You can optionally add a REPEATABLE clause, which gives you the same sample in different runs. <br/><br />
A SELECT query that uses TABLESAMPLE directly is executed with a scan node called SAMPLESCAN. If you use EXPLAIN to view the plan chosen by the PostgreSQL optimizer, you will find a "sample scan" node that scans the sampled table directly. <br/><br />
Generally, TABLESAMPLE can be used only in a SELECT query. It can be combined with joins and aggregation.<br/><br />
TABLESAMPLE currently works only with a sampling percentage: you specify a float (or an expression returning a float), and the query samples that percentage of the table. You cannot specify a number of rows, as you can in SQL Server. <br/><br />
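As a rough illustration of the points above, the following sketch builds a throwaway table and samples it. The table <code>foo_demo</code> and its columns are made up for this example, and the exact plan text may differ between versions.<br />
<syntaxhighlight lang="sql"><br />
-- Hypothetical demo table: 100000 rows in 10 groups<br />
CREATE TEMP TABLE foo_demo AS<br />
SELECT g AS id, g % 10 AS grp<br />
FROM generate_series(1, 100000) AS g;<br />
<br />
-- The plan should contain a sample scan node on foo_demo<br />
EXPLAIN SELECT * FROM foo_demo TABLESAMPLE SYSTEM (1);<br />
<br />
-- Sampling combined with aggregation; the percentage may be an expression<br />
SELECT grp, count(*) AS sampled_rows<br />
FROM foo_demo TABLESAMPLE BERNOULLI (5 * 2)<br />
GROUP BY grp<br />
ORDER BY grp;<br />
</syntaxhighlight><br />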
<br />
==== SYSTEM Option ====<br />
The TABLESAMPLE SYSTEM method returns an approximate percentage of rows. It generates a random number for each physical storage page of the underlying relation. Based on this random number and the specified sampling percentage, it either includes or excludes the corresponding page. If a page is included, all of its rows are returned in the result set. Because the sampling is done at the block (page) level, there are some side effects:<br/><br />
# The result set size varies between runs. The ratio of result rows to total rows is sometimes larger and sometimes smaller than the percentage specified. You can add "limit <number>" to cap the result at <number> tuples.<br />
# If the underlying relation contains only one page, either every tuple on that page is returned or none is. <br/><br />
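The single-page case can be tried out with a small sketch. It assumes the default 8 kB block size, so that the made-up table <code>tiny_demo</code> fits in one page.<br />
<syntaxhighlight lang="sql"><br />
-- Hypothetical one-page table with 50 narrow rows<br />
CREATE TEMP TABLE tiny_demo AS<br />
SELECT g AS id FROM generate_series(1, 50) AS g;<br />
<br />
-- Page-level sampling: the count is expected to be either 0 or 50<br />
SELECT count(*) FROM tiny_demo TABLESAMPLE SYSTEM (50);<br />
</syntaxhighlight><br />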
<br />
==== BERNOULLI Option ====<br />
The TABLESAMPLE BERNOULLI method samples individual rows of the underlying relation. It scans the whole relation and randomly picks individual tuples (essentially a "coin flip" for each tuple). This gives a better random distribution than SYSTEM, but it is slower, especially for small percentages, because every page is still read.<br/><br />
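A minimal sketch of the row-level behaviour, using a made-up table <code>bern_demo</code>: because every row gets its own coin flip, the count can be any value near 10% of the rows rather than a multiple of the rows per page.<br />
<syntaxhighlight lang="sql"><br />
-- Hypothetical demo table with 10000 rows<br />
CREATE TEMP TABLE bern_demo AS<br />
SELECT g AS id FROM generate_series(1, 10000) AS g;<br />
<br />
-- Roughly 1000 rows; the exact count changes from run to run<br />
SELECT count(*) FROM bern_demo TABLESAMPLE BERNOULLI (10);<br />
<br />
-- Comparable hand-written filter, which also visits every row<br />
SELECT count(*) FROM bern_demo WHERE random() < 0.10;<br />
</syntaxhighlight><br />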
<br />
==== REPEATABLE Option ====<br />
In the REPEATABLE clause you can specify a random seed. That number is used to seed the pseudorandom number generator in the PostgreSQL backend. If the same number is used in different runs, the result sets of those runs are the same, as long as the table has not changed. If different numbers are specified, the result sets will generally differ. The SQL Server documentation counts the following actions as changes to the table: inserting, updating, deleting, index rebuilding, index defragmenting, restoring a database, and attaching a database. [http://msdn.microsoft.com/en-us/library/ms189108%28v=sql.105%29.aspx reference] <br/><br />
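One way to check repeatability is to compare two runs that use the same seed. This is only a sketch: <code>rep_demo</code> is a hypothetical table, and the check assumes the table is not modified between the two scans.<br />
<syntaxhighlight lang="sql"><br />
-- Hypothetical demo table with 10000 rows<br />
CREATE TEMP TABLE rep_demo AS<br />
SELECT g AS id FROM generate_series(1, 10000) AS g;<br />
<br />
-- Rows present in the first sample but missing from the second; expected to be 0<br />
SELECT count(*) AS differing_rows<br />
FROM (<br />
    (SELECT id FROM rep_demo TABLESAMPLE BERNOULLI (10) REPEATABLE (200))<br />
    EXCEPT<br />
    (SELECT id FROM rep_demo TABLESAMPLE BERNOULLI (10) REPEATABLE (200))<br />
) AS diff;<br />
</syntaxhighlight><br />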
<br />
=== Examples ===<br />
* '''Selecting a sample'''<br />
<syntaxhighlight lang="sql"> <br />
Select * <br />
from foo TABLESAMPLE SYSTEM (10); --Returns about 10% of rows in foo using SYSTEM method<br />
Select *<br />
from foo TABLESAMPLE BERNOULLI (10); --Using BERNOULLI sampling method <br />
</syntaxhighlight><br />
<br />
* '''Deleting a percentage of rows'''<br />
<syntaxhighlight lang="sql"><br />
Delete<br />
from foo TABLESAMPLE SYSTEM (1); --Proposed syntax: delete 1 percent of rows from foo (the released DELETE command does not accept TABLESAMPLE on its target table)<br />
</syntaxhighlight><br />
<br />
* '''Updating a percentage of rows'''<br />
<syntaxhighlight lang="sql"><br />
Update foo TABLESAMPLE SYSTEM (1)<br />
set col1='col1'; -- Proposed syntax: set col1 to 'col1' in 1 percent of rows in foo (the released UPDATE command does not accept TABLESAMPLE on its target table)<br />
</syntaxhighlight><br />
<br />
* '''Selecting a sample with limit and order by'''<br />
<syntaxhighlight lang="sql"><br />
Select *<br />
from foo TABLESAMPLE SYSTEM (1)<br />
order by col1<br />
limit 100; -- Sample 1 percent of rows from foo, sort the sample by col1, and return the first 100 rows<br />
</syntaxhighlight><br />
<br />
* '''Selecting with repeatable'''<br />
<syntaxhighlight lang="sql"><br />
Select *<br />
from foo TABLESAMPLE SYSTEM (1) REPEATABLE (200); <br />
Select *<br />
from foo TABLESAMPLE SYSTEM (1) REPEATABLE (200); --The result set is the same as above <br />
Select *<br />
from foo TABLESAMPLE SYSTEM (1) REPEATABLE (100); --The result set is generally different from above<br />
</syntaxhighlight><br />
<br />
== References ==<br />
* [http://www.neilconway.org/talks/hacking/ottawa/sql_standard.pdf www.neilconway.org: The TABLESAMPLE Clause: Excerpts From SQL:2003 ]<br />
* [https://public.dhe.ibm.com/software/data/informix/pdfs/16db2-sampling.pdf Speeding up DB2 UDB Using Sampling: Peter J. Haas: IBM Almaden Research Center]<br />
<br />
[[Category:SQL Keyword]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Aggregate_Random&diff=38693Aggregate Random2024-02-29T11:42:55Z<p>Alvherre: fix markup</p>
<hr />
<div>A random value is obtained with the built-in [http://www.postgresql.org/docs/9.0/static/functions-math.html random()] function, but sometimes you want a random element from a grouping. <br />
<br />
== random() ==<br />
{{SnippetInfo2|Aggregate Random|lang=SQL}}<br />
<br />
This snippet allows you to use <tt>random()</tt> as an aggregate function. It is also part of the [[ulib_agg|ulib_agg user-defined library]].<br />
<br />
It should distribute the choices uniformly over the rows in the grouping, whether the value selected is NULL or not. (You could modify SFUNC if you wanted it to select a random non-NULL element.)<br />
<br />
<syntaxhighlight lang="sql"><br />
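CREATE OR REPLACE FUNCTION _final_random(anyarray) <br />
RETURNS anyelement AS<br />
$BODY$<br />
-- Pick a uniformly random subscript between array_lower and array_upper<br />
SELECT $1[array_lower($1,1) + floor((1 + array_upper($1, 1) - array_lower($1, 1))*random())::int];<br />
$BODY$<br />
LANGUAGE sql VOLATILE; -- random() is volatile, so the function must not be marked IMMUTABLE<br />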
<br />
CREATE AGGREGATE random(anyelement) (<br />
SFUNC=array_append, --Function to call for each row. Just builds the array<br />
STYPE=anyarray,<br />
FINALFUNC=_final_random, --Function to call after everything has been added to array<br />
INITCOND='{}' --Initialize an empty array when starting<br />
);<br />
</syntaxhighlight><br />
=== Usage ===<br />
<syntaxhighlight lang="sql">SELECT random(x) AS random_x FROM t;</syntaxhighlight><br />
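<br />
For comparison, a similar per-group pick can be sketched with built-in functions only. This is a different technique from the aggregate above (it sorts each group randomly and takes the first element), and the table <code>pets</code> is made up for the example.<br />
<syntaxhighlight lang="sql"><br />
-- Hypothetical demo data<br />
CREATE TEMP TABLE pets (kind text, name text);<br />
INSERT INTO pets VALUES<br />
    ('cat', 'Tom'), ('cat', 'Felix'),<br />
    ('dog', 'Rex'), ('dog', 'Fido'), ('dog', 'Odie');<br />
<br />
-- One randomly chosen name per kind<br />
SELECT kind, (array_agg(name ORDER BY random()))[1] AS random_name<br />
FROM pets<br />
GROUP BY kind;<br />
</syntaxhighlight><br />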
<br />
[[Category:SQL]]<br />
[[Category:Snippets]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Aggregate_Mode&diff=38692Aggregate Mode2024-02-29T11:42:20Z<p>Alvherre: fix markup</p>
<hr />
<div>The [http://en.wikipedia.org/wiki/Mode_%28statistics%29 statistical mode] is the value that appears most often in a set of values. <br />
<!-- Sometime we need to find the value that occurs most often in a group. Then mathematical concept is mode. --> <br />
<br />
<br />
== Postgres 9.4 has built-in aggregate function mode()==<br />
The new ordered-set aggregate function supersedes the custom aggregate function below. More importantly, it also '''conflicts''' with it. To install it anyway, use a ''different name''. Functionality is (almost) identical, but the new built-in function is ''much faster''.<br />
<br />
Subtle differences:<br />
<br />
* Like most built-in aggregate functions, the built-in <tt>mode()</tt> ignores NULL values. If the most common value is NULL, it returns the second most common value.<br />
* The built-in <tt>mode()</tt> does not raise an error if the expression is NULL in all rows; it returns NULL instead.<br />
<br />
=== Usage ===<br />
The new syntax is different:<br />
<br />
<syntaxhighlight lang="sql">SELECT mode() WITHIN GROUP (ORDER BY some_value) AS modal_value FROM tbl;</syntaxhighlight><br />
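<br />
A small sketch of the NULL handling described above, using inline values rather than a real table:<br />
<syntaxhighlight lang="sql"><br />
-- NULL occurs three times and 2 occurs twice, but NULLs are ignored, so the result is 2<br />
SELECT mode() WITHIN GROUP (ORDER BY v) AS modal_value<br />
FROM (VALUES (1), (2), (2), (NULL), (NULL), (NULL)) AS t(v);<br />
</syntaxhighlight><br />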
<br />
=== See also ===<br />
* [http://www.postgresql.org/docs/current/interactive/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE Ordered-set aggregate functions in the current manual (including <tt>mode()</tt>)]<br />
* [http://www.depesz.com/2014/01/11/waiting-for-9-4-support-ordered-set-within-group-aggregates/ Depesz review of ordered-set aggregate functions]<br />
* [http://stackoverflow.com/questions/31031039/error-within-group-is-required-for-ordered-set-aggregate-mode Stack Overflow post demonstrating the conflict]<br />
<br />
== mode() for Postgres 9.3 or earlier (superseded in 9.4) ==<br />
{{SnippetInfo2|Aggregate Mode|lang=SQL}}<br />
<br />
PostgreSQL makes it easy to add custom aggregate functions. This snippet is also part of the [[ulib_agg|ulib_agg user-defined library]].<br />
<br />
Conceptually, the process is to gather each value into an array. Once all values are in the array, a final function finds the most common value in it.<br />
<br />
<syntaxhighlight lang="sql"><br />
CREATE OR REPLACE FUNCTION _final_mode(anyarray)<br />
RETURNS anyelement AS<br />
$BODY$<br />
SELECT a<br />
FROM unnest($1) a<br />
GROUP BY 1 <br />
ORDER BY COUNT(1) DESC, 1<br />
LIMIT 1;<br />
$BODY$<br />
LANGUAGE sql IMMUTABLE;<br />
<br />
-- Tell Postgres how to use our aggregate<br />
CREATE AGGREGATE mode(anyelement) (<br />
SFUNC=array_append, --Function to call for each row. Just builds the array<br />
STYPE=anyarray,<br />
FINALFUNC=_final_mode, --Function to call after everything has been added to array<br />
INITCOND='{}' --Initialize an empty array when starting<br />
);<br />
</syntaxhighlight><br />
<br />
=== Usage ===<br />
<syntaxhighlight lang="sql">SELECT mode(some_value) AS modal_value FROM t;</syntaxhighlight><br />
<br />
=== Caution ===<br />
Returns an error if the argument is a column of NULL values.<br />
<br />
If you are on PostgreSQL 8.3 or below you will need to add the [[Array Unnest|unnest()]] function to convert an array to a set of rows.<br />
<br />
=== See also ===<br />
* [[Aggregate Range]] <br />
* [[Aggregate Median]]<br />
* <tt>most_common_vals</tt> column of the [http://www.postgresql.org/docs/9.0/static/view-pg-stats.html pg_stats] view <br />
<br />
=== External links ===<br />
* [http://www.databasesoup.com/2013/04/a-very-simple-custom-aggregate.html A very simple custom aggregate]<br />
<br />
[[Category:SQL]]<br />
[[Category:Snippets]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Aggregate_Histogram&diff=38691Aggregate Histogram2024-02-29T11:37:08Z<p>Alvherre: fix markup</p>
<hr />
<div>A [http://en.wikipedia.org/wiki/Histogram histogram] represents the distribution of a set of values.<br />
<br />
== histogram() ==<br />
{{SnippetInfo2|Aggregate Histogram|lang=PLPGSQL}}<br />
<br />
We write a function which runs width_bucket on each value to be aggregated and uses that to increment the corresponding bucket. The state is stored as an array of integers. The array has an element with index zero holding the count of values < MIN and an element with index (nbuckets+1) holding the count of values >= MAX.<br />
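<br />
The bucket numbering can be checked directly with <tt>width_bucket</tt>. The sketch below uses MIN = 0, MAX = 3 and 3 buckets, values chosen only for illustration:<br />
<syntaxhighlight lang="sql"><br />
-- Expected buckets: -1 -> 0, 0 -> 1, 0.5 -> 1, 1 -> 2, 2.9 -> 3, 3 -> 4, 4 -> 4<br />
SELECT v, width_bucket(v, 0.0, 3.0, 3) AS bucket<br />
FROM (VALUES (-1.0), (0.0), (0.5), (1.0), (2.9), (3.0), (4.0)) AS t(v);<br />
</syntaxhighlight><br />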
<br />
This aggregate should work for PostgreSQL > 9.0, but it was only tested with PostgreSQL 9.6. <br />
<br />
<syntaxhighlight lang="sql"><br />
CREATE OR REPLACE FUNCTION hist_sfunc (state INTEGER[], val DOUBLE PRECISION,<br />
MIN DOUBLE PRECISION, MAX DOUBLE PRECISION, nbuckets INTEGER) RETURNS INTEGER[] AS $$<br />
DECLARE<br />
bucket INTEGER;<br />
i INTEGER;<br />
BEGIN<br />
-- Do nothing if val is NULL<br />
IF val IS NULL THEN<br />
RETURN state;<br />
END IF;<br />
<br />
-- This will put values in buckets with a 0 bucket for <MIN and a (nbuckets+1) bucket for >=MAX<br />
bucket := width_bucket(val, MIN, MAX, nbuckets);<br />
<br />
-- Init the array with the correct number of 0's so the caller doesn't see NULLs<br />
IF state[0] IS NULL THEN<br />
state := array_fill(0,ARRAY[nbuckets+2],ARRAY[0]);<br />
END IF;<br />
<br />
state[bucket] := state[bucket] + 1;<br />
<br />
RETURN state;<br />
END;<br />
$$ LANGUAGE plpgsql IMMUTABLE;<br />
<br />
-- Tell Postgres how to use the new function<br />
DROP AGGREGATE IF EXISTS histogram (DOUBLE PRECISION, DOUBLE PRECISION, DOUBLE PRECISION, INTEGER);<br />
CREATE AGGREGATE histogram (val DOUBLE PRECISION, min DOUBLE PRECISION, max DOUBLE PRECISION, nbuckets INTEGER) (<br />
SFUNC = hist_sfunc,<br />
STYPE = INTEGER[],<br />
PARALLEL = SAFE -- Remove line for compatibility with Postgresql < 9.6<br />
);<br />
</syntaxhighlight><br />
<br />
=== Helper functions ===<br />
<br />
We can also define some helper functions that give the midpoints, breaks and ranges of the buckets in the histogram. These functions require PostgreSQL >= 9.5.<br />
<br />
<syntaxhighlight lang="sql"><br />
CREATE OR REPLACE FUNCTION histogram_ranges(MIN DOUBLE PRECISION, MAX DOUBLE PRECISION, nbuckets INTEGER)<br />
RETURNS numrange[] AS<br />
$$<br />
DECLARE<br />
res numrange[];<br />
BEGIN<br />
res := array_agg(numrange(l,u,'[)')) FROM<br />
(SELECT generate_series(MIN::numeric,(MAX-(MAX-MIN)/nbuckets)::numeric,((MAX-MIN)/nbuckets)::numeric) AS l,<br />
generate_series((MIN+(MAX-MIN)/nbuckets)::numeric,MAX::numeric,((MAX-MIN)/nbuckets)::numeric) AS u) t;<br />
<br />
res[0] := numrange(NULL,MIN::numeric,'[)');<br />
res[nbuckets+1] := numrange(MAX::numeric,NULL,'[)');<br />
<br />
RETURN res;<br />
END;<br />
$$ LANGUAGE plpgsql IMMUTABLE;<br />
<br />
CREATE OR REPLACE FUNCTION histogram_breaks(MIN DOUBLE PRECISION, MAX DOUBLE PRECISION, nbuckets INTEGER)<br />
RETURNS DOUBLE PRECISION[] AS<br />
$$<br />
SELECT array(SELECT generate_series(MIN::numeric,MAX::numeric,((MAX-MIN)/nbuckets)::numeric)::DOUBLE PRECISION)<br />
;<br />
$$ LANGUAGE sql IMMUTABLE;<br />
<br />
CREATE OR REPLACE FUNCTION histogram_mids(MIN DOUBLE PRECISION, MAX DOUBLE PRECISION, nbuckets INTEGER)<br />
RETURNS DOUBLE PRECISION[] AS<br />
$$<br />
SELECT array(SELECT generate_series((MIN + 0.5*((MAX-MIN)/nbuckets))::numeric,<br />
MAX::numeric,<br />
((MAX-MIN)/nbuckets)::numeric)::DOUBLE PRECISION);<br />
$$ LANGUAGE sql IMMUTABLE;<br />
<br />
</syntaxhighlight><br />
<br />
<br />
=== Example usage ===<br />
<br />
<syntaxhighlight lang="sql"><br />
WITH a AS (<br />
SELECT generate_series(-2,5,0.5) AS i<br />
)<br />
SELECT array_agg(i) AS values,<br />
histogram_mids(0,3,3) AS mids,<br />
histogram_breaks(0,3,3) AS breaks,<br />
histogram_ranges(0,3,3) AS ranges,<br />
histogram(i,0,3,3) AS counts,<br />
(histogram_ranges(0,3,3))[1:3] AS ranges_in_limits,<br />
(histogram(i,0,3,3))[1:3] AS counts_in_limits<br />
FROM a;<br />
</syntaxhighlight><br />
<br />
=== Caution ===<br />
Returns NULL if the argument is a column of NULL values.<br />
<br />
== See also ==<br />
* [[Aggregate Range]] <br />
* [[Aggregate Median]]<br />
* [[Aggregate Mode]] <br />
<br />
[[Category:SQL]]<br />
[[Category:Snippets]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Aggregate_Array&diff=38690Aggregate Array2024-02-29T11:36:09Z<p>Alvherre: Fix markup</p>
<hr />
<div>The built-in <tt>array_agg()</tt> behaves differently; the aggregate here uses <tt>array_cat()</tt> instead.<br />
<br />
== array_aggcat(anyelement) ==<br />
{{SnippetInfo2|Aggregate array|lang=SQL}}<br />
This snippet is also part of the [[ulib_agg|ulib_agg user-defined library]].<br />
<br />
<syntaxhighlight lang="sql"><br />
CREATE AGGREGATE array_aggcat (anyelement)<br />
( sfunc = array_cat,<br />
stype = anyarray,<br />
initcond = '{}'<br />
);<br />
</syntaxhighlight><br />
<br />
[[Category:SQL]]<br />
[[Category:Snippets]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Mailing_List_Creation_Policies&diff=38634Mailing List Creation Policies2024-02-01T13:14:10Z<p>Alvherre: update one decade of history</p>
<hr />
<div>== Overview ==<br />
<br />
This is meant to document our existing way of creating and maintaining mailing lists. It is not an exhaustive resource, but should highlight the important ways that we keep our community involved and aware of mailing lists that are created, used and archived by the postgresql.org infrastructure teams.<br />
<br />
== Announcing the creation of a list ==<br />
<br />
When a new list is created, we should announce that it was done on pgsql-www@lists.postgresql.org.<br />
<br />
Mailing lists are archived and displayed at http://www.postgresql.org/list/<br />
<br />
[[Category:Community]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=ContributorListings&diff=38632ContributorListings2024-02-01T12:59:01Z<p>Alvherre: add to Community category</p>
<hr />
<div>== Contributor Listing Draft Policy ==<br />
<br />
People to be listed on the PostgreSQL contributor listings include only people who have made substantial, long-term contributions of time to the PostgreSQL project. One-time only contributions are not usually considered adequate for listing, unless they involve quite large amounts of code and time. Financial contributions get listed on the Sponsors page (TBD), not here.<br />
<br />
Listings will last, on average, two years before being re-evaluated. For this purpose, all listings will in the future contain information on what exactly the person contributed and when, although this information might not be displayed. Contributor listings will be updated at least once a year.<br />
<br />
Editing the contributor listings will be carried out by the Core Team.<br />
<br />
== Core Team Section ==<br />
<br />
This contains only the core team.<br />
<br />
<br />
== Major Contributor Section ==<br />
<br />
This contains listings of contributors who have made a substantial positive impact on the development of the PostgreSQL Community.<br />
<br />
== Contributor Section ==<br />
<br />
Contains listings of developers who have made smaller code contributions to the PostgreSQL project. Also lists people who have made large, sustained non-code contributions of their own time to the PostgreSQL project, within the last two years.<br />
<br />
== Past Contributors ==<br />
<br />
Lists people who were in the Major Contributor or Contributor section, but have stopped contributing for two years or more.<br />
<br />
== Hacker Emeritus ==<br />
<br />
Contains a listing of former core team members.<br />
<br />
[[Category:Policies]]<br />
[[Category:Community]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=So,_you_want_to_be_a_developer%3F&diff=38416So, you want to be a developer?2023-11-27T15:25:08Z<p>Alvherre: /* Projects related to PostgreSQL */ update links</p>
<hr />
<div>'''by Selena Deckelmann'''<br />
<br />
This document is a guide for brand-new developers who want to contribute to PostgreSQL but are unsure how to get started or what "the right way" to get involved is. Feedback is welcome, as are additional links to important documents, examples, tutorials and personal stories about contributing to the project.<br />
<br />
We also have a [[Developer_FAQ]].<br />
<br />
== How to get started ==<br />
<br />
=== Overview ===<br />
<br />
Contributing to core PostgreSQL requires a few basic development tools - git, a C development environment and perl. Most modern Linux and BSD operating systems come with "-devel" packages usable for your development needs. At a very high level, you will: <br />
<br />
* Get the basic tools installed and working (git, a C development environment, and perl)<br />
* Clone our git source code repository<br />
* Compile PostgreSQL and successfully run the regression test suite<br />
<br />
Now, you should be ready to start hacking on code!<br />
<br />
=== Source code ===<br />
<br />
Source code can be found at http://git.postgresql.org/gitweb?p=postgresql.git;a=summary<br />
<br />
Once you have git installed, you can check this code out locally with the command: <br />
<br />
git clone https://git.postgresql.org/git/postgresql.git<br />
<br />
While there are release tarballs available, you should use a git clone to work on code with the community.<br />
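<br />
The overview steps (clone, compile, run the regression tests) can be sketched as a small shell function. This is a minimal sketch, not an official build recipe: the function names <code>run</code> and <code>build_postgres</code>, the directory arguments, and the <code>DRY_RUN</code> switch are assumptions made for illustration, and the configure options shown are just one common choice for development builds.<br />

```shell
#!/bin/sh
# Sketch only: clone, build, and regression-test PostgreSQL from git.
# The function names, arguments, and the DRY_RUN switch are illustrative
# assumptions, not part of any PostgreSQL tooling.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_postgres SRC_DIR INSTALL_PREFIX
# --enable-cassert and --enable-debug turn on assertions and debug symbols.
# "make check" runs the regression tests against a temporary installation.
build_postgres() {
    src_dir=$1
    prefix=$2
    run git clone https://git.postgresql.org/git/postgresql.git "$src_dir" &&
    run cd "$src_dir" &&
    run ./configure --prefix="$prefix" --enable-cassert --enable-debug &&
    run make &&
    run make check
}
```

For example, <code>DRY_RUN=1 build_postgres "$HOME/src/postgresql" "$HOME/pg-dev"</code> only prints the commands it would run; both paths are placeholders. Chaining with <code>&&</code> stops at the first failing step, so the regression tests never run against a broken build.<br />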
<br />
=== Hacking PostgreSQL Resources ===<br />
<br />
Several resources on the net explain how to go about hacking on PostgreSQL; these are just a few:<br />
<br />
Neil Conway and Gavin Sherry's original "Introduction to Hacking PostgreSQL": http://www.neilconway.org/talks/hacking/ ; presented at PgCon 2007 (http://www.pgcon.org/2007/schedule/events/8.en.html) and the PostgreSQL 10-year Anniversary Summit<br />
<br />
Stephen Frost's 2013 PgCon Talk "Hacking PostgreSQL" (http://www.pgcon.org/2013/schedule/events/545.en.html) slides are here: http://snowman.net/~sfrost/hackingpg-pgcon13_20130506.pdf and his 2011 PgCon Talk "Review of Patch Reviewing" (http://www.pgcon.org/2011/schedule/events/368.en.html) slides are here: http://www.pgcon.org/2011/schedule/attachments/189_pg_patch_review_20110516.odp<br />
<br />
Andrew Dunstan's "How to be a Happy Hacker", video here: http://www.youtube.com/watch?v=yFDyM29tB6k<br />
<br />
Fabrízio Mello and Dickson Guedes's "Hacking PostgreSQL" youtube channel (PT-BR): https://www.youtube.com/channel/UCjq4gJg4tYy0NqEEo3t60IA<br />
<br />
=== Style Guide ===<br />
<br />
Working with our source code involves some general rules. These are documented in our core documentation: http://www.postgresql.org/docs/current/static/source.html<br />
<br />
At a high level, we indent with tabs (displayed at 4-column tab stops), use strict ANSI C comment formatting, and name variables and functions to match the surrounding code. For example, if you see that variables use a CamelCase style, match that. If they use underscores, or are lowercase, match that. Readability and consistency within a section of code is of greater importance than universal consistency. If a section of code is being substantially reworked, developers sometimes will rework private function names and variable names to match current convention. However, projects to simply rename variables for the sake of renaming them to match current notions of coding style will be rejected.<br />
<br />
=== Bug fixing ===<br />
<br />
Bugs are posted to the mailing list: pgsql-bugs@postgresql.org<br />
<br />
You can see an archive of reported bugs at: https://www.postgresql.org/list/pgsql-bugs/<br />
<br />
Typically, a bug is posted via our [http://www.postgresql.org/support/submitbug/ bug reporting form], and then members of this list respond. Of course, not every issue posted to this list is actually a bug. A good way to learn more about how this process works is to subscribe to the list and observe for a while, before jumping in. Our software is quite complex, and development work spans a couple of decades. With that history, many changes and ideas have been suggested, attempted, failed and succeeded. Please don't be discouraged if your initial ideas are rejected or significantly refactored, or if long-time contributors provide critical feedback on ideas or code. If contributors are responding, it is likely that they are attempting to provide direction to your work and suggesting that you try a different approach, rather than give up.<br />
<br />
Another source of bug reports is the pgsql-general@postgresql.org mailing list. Subscribing and responding to issues posted to this list is a great way to become familiar with the common problems everyday users of PostgreSQL face. Many members of our development community are on this list and respond regularly to user issues. Reading the archives and attempting to respond to issues as they come up is a significant and useful contribution to our community.<br />
<br />
Generally speaking, bug fixes are back-ported to affected branches whenever possible.<br />
<br />
=== TODOs ===<br />
<br />
It's worth checking if the feature of interest is found in the TODO list on our wiki: http://wiki.postgresql.org/wiki/TODO.<br />
<br />
The entries there often have additional information about the feature and may point to reasons why it hasn't been implemented yet. We have attempted to organize issues and link in relevant discussions from our mailing list archives. Please read background information if it is available before attempting to resolve a TODO. If no background information is available, it is appropriate to post a question to pgsql-hackers@postgresql.org and request additional information and inquire about the status of any ongoing work on the problem.<br />
<br />
=== Searching the PostgreSQL archives ===<br />
<br />
Starting a project should always begin with a search of our PostgreSQL mailing list archives. You can start at: https://www.postgresql.org/list/pgsql-hackers/.<br />
<br />
Our project's policy is to discuss as much ongoing code work as possible in public, including any in-progress patches. You may find significant and useful, but uncommitted, code in our archives that can either inform you about current or past work, or reduce the amount of work needed to accomplish a task. There are also some changes to our core project that were rejected, but are perfectly reasonable solutions to problems. Bottom line: searching our archives is a critical skill any member of our community must learn to be effective.<br />
<br />
=== Brand new features ===<br />
<br />
If you have a brand new idea for PostgreSQL, and you've already looked through our archives, scanned the TODO list and reviewed the code relevant to the change you'd like to make, it's time to dive into the pgsql-hackers@postgresql.org mailing list. <br />
<br />
This is a very active list, with 20-100 or more messages posted a day. If you are working on a project, it is prudent to subscribe to the mailing list for at least the duration of the project. The list is quite large, and is made up of contributors and observers from the last 15 years of development effort.<br />
<br />
Bruce Momjian created a presentation on how to get your patch accepted by the PostgreSQL community: http://momjian.us/main/writings/pgsql/patch.pdf<br />
<br />
Your initial post for a new project to our mailing list should include: <br />
<br />
* A description of the problem to be solved, or feature to be implemented.<br />
* Links to relevant standards documentation.<br />
* A short description of the areas of source code to be modified.<br />
* Intended timeline for implementation.<br />
* Links to relevant previous discussions on PostgreSQL mailing lists about the problem or feature.<br />
* CC any members of the development community you'll be directly working with on the project.<br />
* Link to a wiki page on wiki.postgresql.org for ongoing status updates.<br />
<br />
Your best chance of success in implementing a new feature is getting early involvement from members of the development community. It is entirely appropriate and necessary to initiate conversations about features on the pgsql-hackers mailing list, and request feedback in public from those developers who have worked on relevant or similar features in the past. We encourage this communication, and most active developers are willing and interested in providing mentorship in public for work that you undertake.<br />
<br />
New features are always committed to 'master' (the development branch in the git repository). It is the project's policy not to add features to released major versions.<br />
<br />
=== Commitfest and timing ===<br />
<br />
The Commitfest process was designed to keep track of incoming patches, help synchronize development and commit effort, make the review process more obvious and transparent, and to encourage new people to participate in the development of PostgreSQL. <br />
<br />
Developers are required to submit patches to the pgsql-hackers@postgresql.org mailing list before they will be reviewed. Once the email with the patch has been archived on the postgresql.org site, the patch can be linked into the Commitfest application (https://commitfest.postgresql.org). Commitfests are scheduled to start on the 15th of the month, and occur about every two months. We have had about five commitfests per year since the process was created.<br />
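<br />
One common way to produce the patch files to attach to such an email is <code>git format-patch</code>. The following is a minimal sketch, assuming your work is committed on a local branch that was created from <code>master</code>; the function name <code>make_patch_series</code> and its arguments are hypothetical, and the wiki's patch submission guidance remains the authoritative reference.<br />

```shell
#!/bin/sh
# Sketch only: write one patch file per local commit, ready to attach to
# an email. The function name and arguments are illustrative assumptions.

# make_patch_series BASE OUTDIR
# Emits a patch for every commit on the current branch that is not
# reachable from BASE (for example, master), into the directory OUTDIR.
make_patch_series() {
    base=$1
    outdir=$2
    git format-patch -o "$outdir" "$base"
}
```

For example, running <code>make_patch_series master /tmp/my-patches</code> from inside your clone writes files such as <code>0001-''subject''.patch</code>, numbered in commit order; the output directory is a placeholder.<br />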
<br />
Not all patches are required to go through the commitfest process, although most of any substantial size or requiring detailed code review will.<br />
<br />
For the last couple of years, getting a major feature into a major release has generally required getting the patch into the review queue sometime between July and December. Feature freeze may happen in February, and new features will not be accepted until the new major release is complete. (A description and commentary on this is available at: http://rhaas.blogspot.com/2010/07/concurrent-development.html)<br />
<br />
More information about Commitfests is at: http://wiki.postgresql.org/wiki/CommitFest<br />
<br />
== Participating in the development community ==<br />
<br />
Information about the mailing lists is available on the [[Mailing Lists]] page, also reproduced here<br />
for your convenience.<br />
<br />
=== Mailing List Culture === <br />
<br />
The PostgreSQL community exists world-wide on our mailing lists. As you dive into our community, you will encounter people with wildly varying levels of expertise in databases, software development and system administration. Excellent technical and professional advice is given freely on the mailing lists, but there is no guarantee or expectation that anyone can solve any particular problem. Flaming or personal attacks are not tolerated on our mailing lists, IRC or related forums connected to the postgresql.org site.<br />
<br />
Above all, the PostgreSQL community's expectation is that everyone treats others with respect and gives them the benefit of the doubt when it comes to terse or critical language. The Robustness Principle applies to participation in our community: Be conservative in what you send; be liberal in what you accept.<br />
<br />
That said, our community is known for its aggressive and technical discussion style. For those unfamiliar with our community, our discussions can come across as insulting or overly critical. Please keep in mind that as a new contributor, you are encountering a new culture. Every culture has different rules about appropriate behavior, social norms, and expectations. Much like when learning a new language or visiting a new, unfamiliar country, your experiences while joining the PostgreSQL community will undoubtedly include an "adjustment cycle". That can and likely will include high and low moments, friendly or otherwise.<br />
<br />
As with any encounter with an unfamiliar culture, you must take some time to get acquainted. Take extra time to communicate clearly. Ask for clarification if you're confused or a response doesn't make sense to you. Be careful to avoid personal attacks if someone makes a mistake. If there's one universal constant, it is that everyone makes mistakes.<br />
<br />
Remember that we are a learning community, and with few exceptions, people are communicating with the intention of learning, sharing and refining ideas.<br />
<br />
=== Email etiquette mechanics ===<br />
<br />
Signatures that include "confidentiality notices" are useless in the context of PostgreSQL mailing lists. All messages to our lists are archived publicly, are immediately available worldwide and will not be removed from our archives. Please remove the notices from your email to our lists, particularly when posting code that you wish to be contributed or shared with our community.<br />
<br />
When replying, please be respectful and use appropriate quoting. See the [https://web.archive.org/web/20170426175120/http://www.gweep.ca/~edmonds/usenet/ml-etiquette.html Mailing List Etiquette FAQ] for details about what constitutes appropriate quoting when replying to mailing lists. <br />
<br />
Our mailing lists are generally set to "reply to sender", but the preferred way to participate in threads is to "reply all". That means that you'll include both the email address of the sender and the mailing list in your response. Also, please do not send HTML-enriched email to the mailing lists.<br />
<br />
Finally, our community generally does not "top post" in response to mailing list threads (see [https://en.wikipedia.org/wiki/Posting_style#Top-posting Wikipedia: Top Posting] for a definition of top posting).<br />
<br />
=== Using the discussion lists ===<br />
<br />
You can send an email directly to any of the mailing lists, without subscribing first. <br />
Any replies you send should go to the list ''and'' CC the correspondents.<br />
<br />
If you wish to receive the mail traffic sent to a list, you can join using the [https://lists.postgresql.org/ subscribe] form. You should receive an email in response from the mailing list manager software that handles the lists. If you wish to change the various settings associated with your subscription or unsubscribe, you can do so using the [https://lists.postgresql.org/ web] interface.<br />
<br />
If you follow discussion through the web interface instead of subscribing,<br />
you will at some point wish to reply to a message sent to the list. '''Do not''' simply copy<br />
the message body and paste it into a message with a similar subject as a way to join the conversation.<br />
The mailing list relies on the "In-Reply-To" mail header in order to associate individual messages<br />
to their thread. If you don't know how to add this header manually, you should instead make use<br />
of the "raw" link [https://www.postgresql.org/message-id/CA+OCxoxAm_iEh21sxHiYzZxK9_3JjdzHLX4ib--ZbH73yfb_zA@mail.gmail.com provided] on every message view to download the message as a file<br />
(in mbox format), then import it into your favorite email client and use the usual "Reply All"<br />
way of responding to mailing list messages.<br />
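<br />
For readers curious about what the header contains: in Internet mail, "In-Reply-To" carries the "Message-ID" of the message being answered. The following is a minimal sketch, assuming the raw message has already been saved to a local file; the function name <code>in_reply_to_header</code> is hypothetical, and folded (multi-line) header values are not handled.<br />

```shell
#!/bin/sh
# Sketch only: derive the In-Reply-To header for a reply from a saved
# raw message. The function name is an illustrative assumption.

# in_reply_to_header FILE
# Reads only the header section (everything before the first blank line),
# takes the first Message-ID value, and prints the matching reply header.
in_reply_to_header() {
    msgid=$(tr -d '\r' < "$1" |
        sed -n -e '/^$/q' -e 's/^[Mm]essage-[Ii][Dd]:[[:space:]]*//p' |
        head -n 1)
    # Fail if the file has no Message-ID header.
    [ -n "$msgid" ] || return 1
    printf 'In-Reply-To: %s\n' "$msgid"
}
```

For example, <code>in_reply_to_header message.mbox</code> prints the header line to add to your reply, where <code>message.mbox</code> is a placeholder for the downloaded file. Importing the file into a mail client and using "Reply All" achieves the same result without manual editing.<br />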
<br />
=== Overview of discussion lists ===<br />
<br />
We have two primary lists related to usage and development of PostgreSQL: [https://www.postgresql.org/list/pgsql-general/ pgsql-general@postgresql.org] and [https://www.postgresql.org/list/pgsql-hackers/ pgsql-hackers@postgresql.org]. pgsql-general is the correct place to start if you are having a problem with your PostgreSQL installation, need help with installation, are a software developer using PostgreSQL or have a general question about the project. pgsql-hackers is the correct place to go if you have a patch to submit, would like to learn more about how to develop PostgreSQL itself, or are interested in database internals. We also have the [https://www.postgresql.org/list/pgsql-novice/ pgsql-novice@postgresql.org] list if you would like to try posting a question to a smaller list, with a group of people who are there specifically to answer very basic questions.<br />
<br />
If you are primarily interested in performance tuning, benchmarking or case studies from existing users regarding performance, [https://www.postgresql.org/list/pgsql-performance/ pgsql-performance@postgresql.org] is a great list to join.<br />
<br />
If you're interested in contributing to website maintenance or editing, or system administration of PostgreSQL infrastructure, join the [https://www.postgresql.org/list/pgsql-www/ pgsql-www@postgresql.org] mailing list.<br />
<br />
If you have something to contribute to the PostgreSQL documentation, join the [https://www.postgresql.org/list/pgsql-docs/ pgsql-docs@postgresql.org] mailing list. The documentation is always in need of copy editors, testers and example generation.<br />
<br />
If you're interested in staffing booths at conferences, giving talks at conferences, starting a user group or participating in a user group, join the [https://www.postgresql.org/list/pgsql-advocacy/ pgsql-advocacy@postgresql.org] mailing list. We are always in need of booth volunteers, speakers, case study writers and bloggers.<br />
<br />
If you think you've found a bug in PostgreSQL and are new to our project, we suggest you ask about it on the [https://www.postgresql.org/list/pgsql-general/ pgsql-general] list first, and then read our [http://www.postgresql.org/docs/current/static/bug-reporting.html Bug Submission Guidelines] and then go to our [http://www.postgresql.org/support/submitbug Bug Reporting form].<br />
<br />
We also have User Group mailing lists, language-specific lists and some other specific projects with their own communities. You can find a comprehensive list of these at: [http://www.postgresql.org/community/lists/ http://www.postgresql.org/community/lists/]<br />
<br />
=== Wiki ===<br />
<br />
Our wiki is active and frequently updated at: http://wiki.postgresql.org. We encourage contributors to add to the material there, and to make corrections to any errors found.<br />
<br />
=== Projects related to PostgreSQL ===<br />
<br />
There are hundreds of projects that are dependent upon, related to or extend PostgreSQL. You can find a partial list of those projects on the [https://www.postgresql.org/docs/current/external-projects.html External Projects] documentation page, on [https://pgxn.org/ PGXN], or in [https://www.postgresql.org/download/product-categories/ the software catalogue]. Projects are written in a variety of languages, supported by international teams, and are generally fun to hack on. Spend some time exploring the ecosystem of projects around PostgreSQL to get a better feel for the variety and scope of ways that our database is used worldwide.<br />
<br />
=== Our philosophy about conversations/code in public ===<br />
<br />
The PostgreSQL project believes that public code review is the way to achieve our excellent quality of code. Therefore, patches for PostgreSQL must be discussed and submitted in public, and all patches are reviewed publicly. One exception to this policy is that security vulnerabilities may be disclosed to a private mailing list before fixes are published to help prevent exploitation of vulnerable users. <br />
<br />
Related to that, conversations about code, design decisions and user experience occur on the mailing lists. We try to steer all project conversations to the mailing lists so that there is a record of the thought process behind decisions, and so that all the participants and observers of our lists can learn from them.<br />
<br />
=== Resources on contributing to PostgreSQL ===<br />
<br />
* Submitting A Patch: http://wiki.postgresql.org/wiki/Submitting_a_Patch<br />
* Greg Smith, Exposing PostgreSQL Internals with User-Defined Functions http://www.pgcon.org/2010/schedule/attachments/142_HackingWithUDFs.pdf<br />
* Josh Berkus, 50 ways to contribute to PostgreSQL http://www.slideshare.net/PGExperts/50-ways-to-love-your-project<br />
* Laetitia Avrot, De-mystifying contributing to PostgreSQL https://www.slideshare.net/LtitiaAvrot/demystifying-contributing-to-postgresql<br />
<br />
== Acknowledgments ==<br />
<br />
Thanks to Dave Page for feedback, editing and lots of questions.<br />
<br />
[[Category:Community]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=So,_you_want_to_be_a_developer%3F&diff=38409So, you want to be a developer?2023-11-27T10:05:49Z<p>Alvherre: /* Searching the PostgreSQL archives */ update link to ml archives</p>
<hr />
<div>'''by Selena Deckelmann'''<br />
<br />
This document is a guide for brand-new developers who want to contribute to PostgreSQL but are unsure how to get started or what "the right way" to get involved is. Feedback is welcome, as are additional links to important documents, examples, tutorials and personal stories about contributing to the project.<br />
<br />
We also have a [[Developer_FAQ]].<br />
<br />
== How to get started ==<br />
<br />
=== Overview ===<br />
<br />
Contributing to core PostgreSQL requires a few basic development tools - git, a C development environment and perl. Most modern Linux and BSD operating systems come with "-devel" packages usable for your development needs. At a very high level, you will: <br />
<br />
* Get the basic tools installed and working (git, a C development environment, and perl)<br />
* Clone our git source code repository<br />
* Compile PostgreSQL and successfully run the regression test suite<br />
<br />
Now, you should be ready to start hacking on code!<br />
<br />
=== Source code ===<br />
<br />
Source code can be found at http://git.postgresql.org/gitweb?p=postgresql.git;a=summary<br />
<br />
Once you have git installed, you can check this code out locally with the command: <br />
<br />
git clone https://git.postgresql.org/git/postgresql.git<br />
<br />
While there are release tarballs available, you should use a git clone to work on code with the community.<br />
<br />
=== Hacking PostgreSQL Resources ===<br />
<br />
Several resources on the net explain how to go about hacking on PostgreSQL; these are just a few:<br />
<br />
Neil Conway and Gavin Sherry's original "Introduction to Hacking PostgreSQL": http://www.neilconway.org/talks/hacking/ ; presented at PgCon 2007 (http://www.pgcon.org/2007/schedule/events/8.en.html) and the PostgreSQL 10-year Anniversary Summit<br />
<br />
Stephen Frost's 2013 PgCon Talk "Hacking PostgreSQL" (http://www.pgcon.org/2013/schedule/events/545.en.html) slides are here: http://snowman.net/~sfrost/hackingpg-pgcon13_20130506.pdf and his 2011 PgCon Talk "Review of Patch Reviewing" (http://www.pgcon.org/2011/schedule/events/368.en.html) slides are here: http://www.pgcon.org/2011/schedule/attachments/189_pg_patch_review_20110516.odp<br />
<br />
Andrew Dunstan's "How to be a Happy Hacker", video here: http://www.youtube.com/watch?v=yFDyM29tB6k<br />
<br />
Fabrízio Mello and Dickson Guedes's "Hacking PostgreSQL" youtube channel (PT-BR): https://www.youtube.com/channel/UCjq4gJg4tYy0NqEEo3t60IA<br />
<br />
=== Style Guide ===<br />
<br />
Working with our source code involves some general rules. These are documented in our core documentation: http://www.postgresql.org/docs/current/static/source.html<br />
<br />
At a high level, we indent with tabs (displayed at 4-column tab stops), use strict ANSI C comment formatting, and name variables and functions to match the surrounding code. For example, if you see that variables use a CamelCase style, match that. If they use underscores, or are lowercase, match that. Readability and consistency within a section of code is of greater importance than universal consistency. If a section of code is being substantially reworked, developers sometimes will rework private function names and variable names to match current convention. However, projects to simply rename variables for the sake of renaming them to match current notions of coding style will be rejected.<br />
<br />
=== Bug fixing ===<br />
<br />
Bugs are posted to the mailing list: pgsql-bugs@postgresql.org<br />
<br />
You can see an archive of reported bugs at: https://www.postgresql.org/list/pgsql-bugs/<br />
<br />
Typically, a bug is posted via our [http://www.postgresql.org/support/submitbug/ bug reporting form], and then members of this list respond. Of course, not every issue posted to this list is actually a bug. A good way to learn more about how this process works is to subscribe to the list and observe for a while, before jumping in. Our software is quite complex, and development work spans a couple of decades. With that history, many changes and ideas have been suggested, attempted, failed and succeeded. Please don't be discouraged if your initial ideas are rejected or significantly refactored, or if long-time contributors provide critical feedback on ideas or code. If contributors are responding, it is likely that they are attempting to provide direction to your work and suggesting that you try a different approach, rather than give up.<br />
<br />
Another source of bug reports is the pgsql-general@postgresql.org mailing list. Subscribing and responding to issues posted to this list is a great way to become familiar with the common problems everyday users of PostgreSQL face. Many members of our development community are on this list and respond regularly to user issues. Reading archives, and attempting to respond to issues as they come up is a significant and useful contribution to our community. <br />
<br />
Generally speaking, bug fixes are back-ported to affected branches whenever possible.<br />
<br />
=== TODOs ===<br />
<br />
It's worth checking if the feature of interest is found in the TODO list on our wiki: http://wiki.postgresql.org/wiki/TODO.<br />
<br />
The entries there often have additional information about the feature and may point to reasons why it hasn't been implemented yet. We have attempted to organize issues and link in relevant discussions from our mailing list archives. Please read background information if it is available before attempting to resolve a TODO. If no background information is available, it is appropriate to post a question to pgsql-hackers@postgresql.org and request additional information and inquire about the status of any ongoing work on the problem.<br />
<br />
=== Searching the PostgreSQL archives ===<br />
<br />
Starting a project should always begin with a search of our PostgreSQL mailing list archives. You can start at: https://www.postgresql.org/list/pgsql-hackers/.<br />
<br />
Our project's policy is to discuss as much ongoing code work as possible in public, including any in-progress patches. You may find significant and useful, but uncommitted, code in our archives that can either inform you about current or past work, or reduce the amount of work needed to accomplish a task. There are also some changes to our core project that were rejected, but are perfectly reasonable solutions to problems. Bottom line: searching our archives is a critical skill any member of our community must learn to be effective.<br />
<br />
=== Brand new features ===<br />
<br />
If you have a brand new idea for PostgreSQL, and you've already looked through our archives, scanned the TODO list and reviewed the code relevant to the change you'd like to make, it's time to dive into the pgsql-hackers@postgresql.org mailing list. <br />
<br />
This is a very active list, with 20-100 or more messages posted a day. If you are working on a project, it is prudent to subscribe to the mailing list for at least the duration of the project. The list is quite large, and is made up of contributors and observers from the last 15 years of development effort.<br />
<br />
Bruce Momjian created a presentation on how to get your patch accepted by the PostgreSQL community: http://momjian.us/main/writings/pgsql/patch.pdf<br />
<br />
Your initial post for a new project to our mailing list should include: <br />
<br />
* A description of the problem to be solved, or feature to be implemented.<br />
* Links to relevant standards documentation.<br />
* A short description of the areas of source code to be modified.<br />
* Intended timeline for implementation.<br />
* Links to relevant previous discussions on PostgreSQL mailing lists about the problem or feature.<br />
* CC any members of the development community you'll be directly working with on the project.<br />
* Link to a wiki page on wiki.postgresql.org for ongoing status updates.<br />
<br />
Your best chance of success in implementing a new feature is getting early involvement from members of the development community. It is entirely appropriate and necessary to initiate conversations about features on the pgsql-hackers mailing list, and request feedback in public from those developers who have worked on relevant or similar features in the past. We encourage this communication, and most active developers are willing and interested in providing mentorship in public for work that you undertake.<br />
<br />
New features are always committed to 'master' (the development branch in the git repository). It is the project's policy not to add features to released major versions.<br />
<br />
=== Commitfest and timing ===<br />
<br />
The Commitfest process was designed to keep track of incoming patches, help synchronize development and commit effort, make the review process more obvious and transparent, and to encourage new people to participate in the development of PostgreSQL. <br />
<br />
Developers are required to submit patches to the pgsql-hackers@postgresql.org mailing list before they will be reviewed. Once the email with the patch has been archived on the postgresql.org site, the patch can be linked into the Commitfest application (https://commitfest.postgresql.org). Commitfests are scheduled to start on the 15th of the month, and occur about every two months. We have had about five commitfests per year since the process was created.<br />
<br />
Not all patches are required to go through the commitfest process, although most of any substantial size or requiring detailed code review will.<br />
<br />
For the last couple of years, getting a major feature into a major release has generally required getting the patch into the review queue sometime between July and December. Feature freeze may happen in February, and new features will not be accepted until the new major release is complete. (A description and commentary on this is available at: http://rhaas.blogspot.com/2010/07/concurrent-development.html)<br />
<br />
More information about Commitfests is at: http://wiki.postgresql.org/wiki/CommitFest<br />
<br />
== Participating in the development community ==<br />
<br />
Information about the mailing lists is available on the [[Mailing Lists]] page, also reproduced here<br />
for your convenience.<br />
<br />
=== Mailing List Culture === <br />
<br />
The PostgreSQL community exists world-wide on our mailing lists. As you dive into our community, you will encounter people with wildly varying levels of expertise for databases, software development and system administration. Excellent technical and professional advice is given freely on the mailing lists, but there is no guarantee or expectation that anyone can solve any particular problem. Flaming or personal attacks are not tolerated on our mailing lists, IRC or related forums connected to the postgresql.org site. <br />
<br />
Above all, the PostgreSQL community expects everyone to treat others with respect and to give them the benefit of the doubt when it comes to terse or critical language. The Robustness Principle applies to participation in our community: Be conservative in what you send; be liberal in what you accept.<br />
<br />
That said, our community is known for its aggressive and technical discussion style. For those unfamiliar with our community, our discussions can come across as insulting or overly critical. Please keep in mind that as a new contributor, you are encountering a new culture. Every culture has different rules about appropriate behavior, social norms, and expectations. Much like when learning a new language or visiting a new, unfamiliar country, your experiences while joining the PostgreSQL community will undoubtedly include an "adjustment cycle". That can and likely will include high and low moments, friendly or otherwise.<br />
<br />
As with any encounter with an unfamiliar culture, you must take some time to get acquainted. Take extra time to communicate clearly. Ask for clarification if you're confused or a response doesn't make sense to you. Be careful to avoid personal attacks if someone makes a mistake. If there's one universal constant, it is that everyone makes mistakes.<br />
<br />
Remember that we are a learning community, and with few exceptions, people are communicating with the intention of learning, sharing and refining ideas.<br />
<br />
=== Email etiquette mechanics ===<br />
<br />
Signatures that include "confidentiality notices" are useless in the context of PostgreSQL mailing lists. All messages to our lists are archived publicly, are immediately available worldwide and will not be removed from our archives. Please remove the notices from your email to our lists, particularly when posting code that you wish to be contributed or shared with our community.<br />
<br />
When replying, please be respectful and use appropriate quoting. See the [https://web.archive.org/web/20170426175120/http://www.gweep.ca/~edmonds/usenet/ml-etiquette.html Mailing List Etiquette FAQ] for details about what constitutes appropriate quoting when replying to mailing lists. <br />
<br />
Our mailing lists are generally set to "reply to sender", but the preferred way to participate in threads is to "reply all". That means that you'll include both the email address of the sender and the mailing list in your response. Also, please do not send HTML-enriched email to the mailing lists.<br />
<br />
Finally, our community generally does not "top post" in response to mailing list threads (see [https://en.wikipedia.org/wiki/Posting_style#Top-posting Wikipedia: Top Posting] for a definition of top posting).<br />
<br />
=== Using the discussion lists ===<br />
<br />
You can send an email directly to any of the mailing lists, without subscribing first. <br />
Replies, both those you send and those you receive, should go to the list ''and'' CC the correspondents.<br />
<br />
If you wish to receive the mail traffic sent to a list, you can join using the [https://lists.postgresql.org/ subscribe] form. You should receive an email in response from the mailing list manager software that handles the lists. If you wish to change the settings associated with your subscription or to unsubscribe, you can do so using the [https://lists.postgresql.org/ web] interface.<br />
<br />
If you follow discussion through the web interface instead of subscribing,<br />
you will at some point wish to reply to a message sent to the list. '''Do not''' simply copy<br />
the message body and paste it into a message with a similar subject as a way to join the conversation.<br />
The mailing list relies on the "In-Reply-To" mail header in order to associate individual messages<br />
to their thread. If you don't know how to add this header manually, you should instead make use<br />
of the "raw" link [https://www.postgresql.org/message-id/CA+OCxoxAm_iEh21sxHiYzZxK9_3JjdzHLX4ib--ZbH73yfb_zA@mail.gmail.com provided] on every message view to download the message as a file<br />
(in mbox format), then import it into your favorite email client and use the usual "Reply All"<br />
way of responding to mailing list messages.<br />
<br />
=== Overview of discussion lists ===<br />
<br />
We have two primary lists related to usage and development of PostgreSQL: [https://www.postgresql.org/list/pgsql-general/ pgsql-general@postgresql.org] and [https://www.postgresql.org/list/pgsql-hackers/ pgsql-hackers@postgresql.org]. pgsql-general is the correct place to start if you are having a problem with your PostgreSQL installation, need help with installation, are a software developer using PostgreSQL or have a general question about the project. pgsql-hackers is the correct place to go if you have a patch to submit, would like to learn more about how to develop PostgreSQL itself, or are interested in database internals. We also have the [https://www.postgresql.org/list/pgsql-novice/ pgsql-novice@postgresql.org] list if you would like to try posting a question to a smaller list, with a group of people who are there specifically to answer very basic questions.<br />
<br />
If you are primarily interested in performance tuning, benchmarking or case studies from existing users regarding performance, [https://www.postgresql.org/list/pgsql-performance/ pgsql-performance@postgresql.org] is a great list to join.<br />
<br />
If you're interested in contributing to website maintenance or editing, or system administration of PostgreSQL infrastructure, join the [https://www.postgresql.org/list/pgsql-www/ pgsql-www@postgresql.org] mailing list.<br />
<br />
If you have something to contribute to the PostgreSQL documentation, join the [https://www.postgresql.org/list/pgsql-docs/ pgsql-docs@postgresql.org] mailing list. The documentation is always in need of copy editors, testers and example generation.<br />
<br />
If you're interested in staffing booths at conferences, giving talks at conferences, starting a user group or participating in a user group, join the [https://www.postgresql.org/list/pgsql-advocacy/ pgsql-advocacy@postgresql.org] mailing list. We are always in need of booth volunteers, speakers, case study writers and bloggers.<br />
<br />
If you think you've found a bug in PostgreSQL and are new to our project, we suggest you ask about it on the [https://www.postgresql.org/list/pgsql-general/ pgsql-general] list first, then read our [http://www.postgresql.org/docs/current/static/bug-reporting.html Bug Submission Guidelines], and then use our [http://www.postgresql.org/support/submitbug Bug Reporting form].<br />
<br />
We also have User Group mailing lists, language-specific lists and some other specific projects with their own communities. You can find a comprehensive list of these at: [http://www.postgresql.org/community/lists/ http://www.postgresql.org/community/lists/]<br />
<br />
=== Wiki ===<br />
<br />
Our wiki is active and frequently updated at: http://wiki.postgresql.org. We encourage contributors to add to the material there, and to make corrections to any errors found.<br />
<br />
=== Projects related to PostgreSQL ===<br />
<br />
There are hundreds of projects that depend on, relate to or extend PostgreSQL. You can find a partial list of those projects on the [https://www.postgresql.org/docs/current/external-projects.html External Projects] doc page, [[PGXN]] or [[Pgfoundry]]. Projects are written in a variety of languages, supported by international teams, and are generally fun to hack on. Spend some time exploring the ecosystem of projects around PostgreSQL to get a better feel for the variety and scope of ways that our database is used worldwide.<br />
<br />
=== Our philosophy about conversations/code in public ===<br />
<br />
The PostgreSQL project believes that public code review is the way to achieve our excellent quality of code. Therefore, patches for PostgreSQL must be discussed and submitted in public, and all patches are reviewed publicly. One exception to this policy is that security vulnerabilities may be disclosed to a private mailing list before fixes are published to help prevent exploitation of vulnerable users. <br />
<br />
Related to that, conversations about code, design decisions and user experience occur on the mailing lists. We try to steer all project conversations to the mailing lists so that there is a record of the thought process behind decisions, and so that all the participants and observers of our lists can learn from them.<br />
<br />
=== Resources on contributing to PostgreSQL ===<br />
<br />
* Submitting A Patch: http://wiki.postgresql.org/wiki/Submitting_a_Patch<br />
* Greg Smith, Exposing PostgreSQL Internals with User-Defined Functions http://www.pgcon.org/2010/schedule/attachments/142_HackingWithUDFs.pdf<br />
* Josh Berkus, 50 ways to contribute to PostgreSQL http://www.slideshare.net/PGExperts/50-ways-to-love-your-project<br />
* Laetitia Avrot, De-mystifying contributing to PostgreSQL https://www.slideshare.net/LtitiaAvrot/demystifying-contributing-to-postgresql<br />
<br />
== Acknowledgments ==<br />
<br />
Thanks to Dave Page for feedback, editing and lots of questions.<br />
<br />
[[Category:Community]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=So,_you_want_to_be_a_developer%3F&diff=38408So, you want to be a developer?2023-11-27T10:03:11Z<p>Alvherre: /* Source code */ markup command as code</p>
<hr />
<div>'''by Selena Deckelmann'''<br />
<br />
This document is meant as a guide for brand new developers who want to contribute to PostgreSQL but are unsure how to get started or what "the right way" to get involved is. Feedback is welcome, as are additional links to important documents, examples, tutorials and personal stories about contributing to the project.<br />
<br />
We also have a [[Developer_FAQ]].<br />
<br />
== How to get started ==<br />
<br />
=== Overview ===<br />
<br />
Contributing to core PostgreSQL requires a few basic development tools: git, a C development environment and perl. Most modern Linux and BSD operating systems come with "-devel" packages usable for your development needs. At a very high level, you will: <br />
<br />
* Get the basic tools installed and working (git, a C development environment, and perl)<br />
* Clone our git source code repository<br />
* Compile PostgreSQL and successfully run the regression test suite<br />
<br />
Now, you should be ready to start hacking on code!<br />
<br />
=== Source code ===<br />
<br />
Source code can be found at http://git.postgresql.org/gitweb?p=postgresql.git;a=summary<br />
<br />
Once you have git installed, you can check this code out locally with the command: <br />
<br />
git clone https://git.postgresql.org/git/postgresql.git<br />
<br />
While there are release tarballs available, you should use a git clone to work on code with the community.<br />
<br />
=== Hacking PostgreSQL Resources ===<br />
<br />
Several resources on the net explain how to go about hacking on PostgreSQL; these are just a few:<br />
<br />
Neil Conway and Gavin Sherry's original "Introduction to Hacking PostgreSQL": http://www.neilconway.org/talks/hacking/ ; presented at PgCon 2007 (http://www.pgcon.org/2007/schedule/events/8.en.html) and the PostgreSQL 10-year Anniversary Summit<br />
<br />
Stephen Frost's 2013 PgCon Talk "Hacking PostgreSQL" (http://www.pgcon.org/2013/schedule/events/545.en.html) slides are here: http://snowman.net/~sfrost/hackingpg-pgcon13_20130506.pdf and his 2011 PgCon Talk "Review of Patch Reviewing" (http://www.pgcon.org/2011/schedule/events/368.en.html) slides are here: http://www.pgcon.org/2011/schedule/attachments/189_pg_patch_review_20110516.odp<br />
<br />
Andrew Dunstan's "How to be a Happy Hacker", video here: http://www.youtube.com/watch?v=yFDyM29tB6k<br />
<br />
Fabrízio Mello and Dickson Guedes's "Hacking PostgreSQL" youtube channel (PT-BR): https://www.youtube.com/channel/UCjq4gJg4tYy0NqEEo3t60IA<br />
<br />
=== Style Guide ===<br />
<br />
Working with our source code involves some general rules. These are documented in our core documentation: http://www.postgresql.org/docs/current/static/source.html<br />
<br />
At a high level, we indent with tabs displayed 4 columns wide, use strict ANSI C comment formatting, and name variables and functions to match the surrounding code. For example, if you see that variables use a CamelCase style, match that. If they use underscores, or are lowercase, match that. Readability and consistency within a section of code are of greater importance than universal consistency. If a section of code is being substantially reworked, developers will sometimes rework private function names and variable names to match current convention. However, projects that simply rename variables to match current notions of coding style will be rejected.<br />
<br />
=== Bug fixing ===<br />
<br />
Bugs are posted to the mailing list: pgsql-bugs@postgresql.org<br />
<br />
You can see an archive of reported bugs at: https://www.postgresql.org/list/pgsql-bugs/<br />
<br />
Typically, a bug is posted via our [http://www.postgresql.org/support/submitbug/ bug reporting form], and then members of this list respond. Of course, not every issue posted to this list is actually a bug. A good way to learn more about how this process works is to subscribe to the list and observe for a while before jumping in. Our software is quite complex, and development work spans a couple of decades. Over that history, many changes and ideas have been suggested and attempted; some failed and some succeeded. Please don't be discouraged if your initial ideas are rejected or significantly refactored, or if long-time contributors provide critical feedback on your ideas or code. If contributors are responding, it is likely that they are attempting to provide direction to your work and suggesting that you try a different approach, rather than give up.<br />
<br />
Another source of bug reports is the pgsql-general@postgresql.org mailing list. Subscribing and responding to issues posted to this list is a great way to become familiar with the common problems everyday users of PostgreSQL face. Many members of our development community are on this list and respond regularly to user issues. Reading archives, and attempting to respond to issues as they come up is a significant and useful contribution to our community. <br />
<br />
Generally speaking, bug fixes are back-ported to affected branches whenever possible.<br />
<br />
=== TODOs ===<br />
<br />
It's worth checking if the feature of interest is found in the TODO list on our wiki: http://wiki.postgresql.org/wiki/TODO.<br />
<br />
The entries there often have additional information about the feature and may point to reasons why it hasn't been implemented yet. We have attempted to organize issues and link in relevant discussions from our mailing list archives. Please read background information if it is available before attempting to resolve a TODO. If no background information is available, it is appropriate to post a question to pgsql-hackers@postgresql.org and request additional information and inquire about the status of any ongoing work on the problem.<br />
<br />
=== Searching the PostgreSQL archives ===<br />
<br />
Starting a project should always begin with a search of our PostgreSQL mailing list archives. You can start at: http://archives.postgresql.org<br />
<br />
Our project's policy is to discuss as much ongoing code work as possible in public, including any in-progress patches. You may find significant and useful, but uncommitted, code in our archives that can either inform you about current or past work, or reduce the amount of work needed to accomplish a task. There are also some changes to our core project that were rejected, but are perfectly reasonable solutions to problems. Bottom line: searching our archives is a critical skill any member of our community must learn to be effective.<br />
<br />
=== Brand new features ===<br />
<br />
If you have a brand new idea for PostgreSQL, and you've already looked through our archives, scanned the TODO list and reviewed the code relevant to the change you'd like to make, it's time to dive into the pgsql-hackers@postgresql.org mailing list. <br />
<br />
This is a very active list, with 20 to 100 or more messages posted a day. If you are working on a project, it is prudent to subscribe to the mailing list for at least the duration of the project. The list is quite large, and is made up of contributors and observers from the last 15 years of development effort.<br />
<br />
Bruce Momjian created a presentation on how to get your patch accepted by the PostgreSQL community: http://momjian.us/main/writings/pgsql/patch.pdf<br />
<br />
Your initial post for a new project to our mailing list should include: <br />
<br />
* A description of the problem to be solved, or feature to be implemented.<br />
* Links to relevant standards documentation.<br />
* A short description of the areas of source code to be modified.<br />
* Intended timeline for implementation.<br />
* Links to relevant previous discussions on PostgreSQL mailing lists about the problem or feature.<br />
* CC any members of the development community you'll be directly working with on the project.<br />
* Link to a wiki page on wiki.postgresql.org for ongoing status updates.<br />
<br />
Your best chance of success in implementing a new feature is getting early involvement from members of the development community. It is entirely appropriate and necessary to initiate conversations about features on the pgsql-hackers mailing list, and request feedback in public from those developers who have worked on relevant or similar features in the past. We encourage this communication, and most active developers are willing and interested in providing mentorship in public for work that you undertake.<br />
<br />
New features are always committed to 'master' (the development branch in the git repository). It is the project's policy not to add features to released major versions.<br />
<br />
=== Commitfest and timing ===<br />
<br />
The Commitfest process was designed to keep track of incoming patches, help synchronize development and commit effort, make the review process more obvious and transparent, and to encourage new people to participate in the development of PostgreSQL. <br />
<br />
Developers are required to submit patches to the pgsql-hackers@postgresql.org mailing list before they will be reviewed. Once the email with the patch has been archived on the postgresql.org site, the patch can be linked into the Commitfest application (https://commitfest.postgresql.org). Commitfests are scheduled to start on the 15th of the month, and occur about every two months. We have had about five commitfests per year since the process was created.<br />
<br />
Not all patches are required to go through the commitfest process, although most of any substantial size or requiring detailed code review will.<br />
<br />
For the last couple of years, getting a major feature into a major dot release has generally required getting the patch into the review queue sometime between July and December. Feature freeze may happen in February, and new features will not be accepted until the new major release is complete. (A description and commentary on this is available at: http://rhaas.blogspot.com/2010/07/concurrent-development.html)<br />
<br />
More information about Commitfests is at: http://wiki.postgresql.org/wiki/CommitFest<br />
<br />
== Participating in the development community ==<br />
<br />
Information about the mailing lists is available on the [[Mailing Lists]] page, also reproduced here<br />
for your convenience.<br />
<br />
=== Mailing List Culture === <br />
<br />
The PostgreSQL community exists world-wide on our mailing lists. As you dive into our community, you will encounter people with wildly varying levels of expertise for databases, software development and system administration. Excellent technical and professional advice is given freely on the mailing lists, but there is no guarantee or expectation that anyone can solve any particular problem. Flaming or personal attacks are not tolerated on our mailing lists, IRC or related forums connected to the postgresql.org site. <br />
<br />
Above all, the PostgreSQL community expects everyone to treat others with respect and to give them the benefit of the doubt when it comes to terse or critical language. The Robustness Principle applies to participation in our community: Be conservative in what you send; be liberal in what you accept.<br />
<br />
That said, our community is known for its aggressive and technical discussion style. For those unfamiliar with our community, our discussions can come across as insulting or overly critical. Please keep in mind that as a new contributor, you are encountering a new culture. Every culture has different rules about appropriate behavior, social norms, and expectations. Much like when learning a new language or visiting a new, unfamiliar country, your experiences while joining the PostgreSQL community will undoubtedly include an "adjustment cycle". That can and likely will include high and low moments, friendly or otherwise.<br />
<br />
As with any encounter with an unfamiliar culture, you must take some time to get acquainted. Take extra time to communicate clearly. Ask for clarification if you're confused or a response doesn't make sense to you. Be careful to avoid personal attacks if someone makes a mistake. If there's one universal constant, it is that everyone makes mistakes.<br />
<br />
Remember that we are a learning community, and with few exceptions, people are communicating with the intention of learning, sharing and refining ideas.<br />
<br />
=== Email etiquette mechanics ===<br />
<br />
Signatures that include "confidentiality notices" are useless in the context of PostgreSQL mailing lists. All messages to our lists are archived publicly, are immediately available worldwide and will not be removed from our archives. Please remove the notices from your email to our lists, particularly when posting code that you wish to be contributed or shared with our community.<br />
<br />
When replying, please be respectful and use appropriate quoting. See the [https://web.archive.org/web/20170426175120/http://www.gweep.ca/~edmonds/usenet/ml-etiquette.html Mailing List Etiquette FAQ] for details about what constitutes appropriate quoting when replying to mailing lists. <br />
<br />
Our mailing lists are generally set to "reply to sender", but the preferred way to participate in threads is to "reply all". That means that you'll include both the email address of the sender and the mailing list in your response. Also, please do not send HTML-enriched email to the mailing lists.<br />
<br />
Finally, our community generally does not "top post" in response to mailing list threads (see [https://en.wikipedia.org/wiki/Posting_style#Top-posting Wikipedia: Top Posting] for a definition of top posting).<br />
<br />
=== Using the discussion lists ===<br />
<br />
You can send an email directly to any of the mailing lists, without subscribing first. <br />
Replies, both those you send and those you receive, should go to the list ''and'' CC the correspondents.<br />
<br />
If you wish to receive the mail traffic sent to a list, you can join using the [https://lists.postgresql.org/ subscribe] form. You should receive an email in response from the mailing list manager software that handles the lists. If you wish to change the settings associated with your subscription or to unsubscribe, you can do so using the [https://lists.postgresql.org/ web] interface.<br />
<br />
If you follow discussion through the web interface instead of subscribing,<br />
you will at some point wish to reply to a message sent to the list. '''Do not''' simply copy<br />
the message body and paste it into a message with a similar subject as a way to join the conversation.<br />
The mailing list relies on the "In-Reply-To" mail header in order to associate individual messages<br />
to their thread. If you don't know how to add this header manually, you should instead make use<br />
of the "raw" link [https://www.postgresql.org/message-id/CA+OCxoxAm_iEh21sxHiYzZxK9_3JjdzHLX4ib--ZbH73yfb_zA@mail.gmail.com provided] on every message view to download the message as a file<br />
(in mbox format), then import it into your favorite email client and use the usual "Reply All"<br />
way of responding to mailing list messages.<br />
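<br />
As a rough illustration of what "Reply All" does behind the scenes, the Python sketch below builds a reply that carries the threading headers. The helper name <code>build_reply</code> and all addresses are hypothetical, and only the standard library <code>email</code> package is assumed; a real mail client does this for you.<br />
<br />
```python
from email.message import EmailMessage


def build_reply(original: EmailMessage, sender: str, body: str) -> EmailMessage:
    """Build a "reply all" message that archives can thread under `original`.

    Hypothetical helper for illustration only, not part of any PostgreSQL tooling.
    """
    parent_id = original["Message-ID"]
    if not parent_id:
        raise ValueError("original message has no Message-ID to reply to")

    reply = EmailMessage()
    reply["From"] = sender

    # "Reply all": the original author goes in To, the list and any other
    # recipients of the original message go in Cc.
    reply["To"] = original["From"]
    cc = [value for value in (original["To"], original["Cc"]) if value]
    if cc:
        reply["Cc"] = ", ".join(cc)

    # Keep the subject, adding "Re: " only if it is not already there.
    subject = original["Subject"] or ""
    reply["Subject"] = subject if subject.lower().startswith("re:") else "Re: " + subject

    # Threading headers: In-Reply-To names the parent message, and
    # References carries the chain of earlier messages plus the parent.
    reply["In-Reply-To"] = parent_id
    earlier = original["References"]
    reply["References"] = f"{earlier} {parent_id}" if earlier else parent_id

    reply.set_content(body)
    return reply


if __name__ == "__main__":
    original = EmailMessage()
    original["From"] = "alice@example.org"              # hypothetical author
    original["To"] = "some-list@lists.example.org"      # hypothetical list address
    original["Subject"] = "Patch idea"
    original["Message-ID"] = "<orig1@example.org>"
    original.set_content("What do you think?")

    print(build_reply(original, "bob@example.org", "Looks reasonable to me."))
```
<br />
A message pasted into a fresh email lacks the <code>In-Reply-To</code> value shown here, which is why it shows up as a new, disconnected thread.<br />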
<br />
=== Overview of discussion lists ===<br />
<br />
We have two primary lists related to usage and development of PostgreSQL: [https://www.postgresql.org/list/pgsql-general/ pgsql-general@postgresql.org] and [https://www.postgresql.org/list/pgsql-hackers/ pgsql-hackers@postgresql.org]. pgsql-general is the correct place to start if you are having a problem with your PostgreSQL installation, need help with installation, are a software developer using PostgreSQL or have a general question about the project. pgsql-hackers is the correct place to go if you have a patch to submit, would like to learn more about how to develop PostgreSQL itself, or are interested in database internals. We also have the [https://www.postgresql.org/list/pgsql-novice/ pgsql-novice@postgresql.org] list if you would like to try posting a question to a smaller list, with a group of people who are there specifically to answer very basic questions.<br />
<br />
If you are primarily interested in performance tuning, benchmarking or case studies from existing users regarding performance, [https://www.postgresql.org/list/pgsql-performance/ pgsql-performance@postgresql.org] is a great list to join.<br />
<br />
If you're interested in contributing to website maintenance or editing, or system administration of PostgreSQL infrastructure, join the [https://www.postgresql.org/list/pgsql-www/ pgsql-www@postgresql.org] mailing list.<br />
<br />
If you have something to contribute to the PostgreSQL documentation, join the [https://www.postgresql.org/list/pgsql-docs/ pgsql-docs@postgresql.org] mailing list. The documentation is always in need of copy editors, testers and example generation.<br />
<br />
If you're interested in staffing booths at conferences, giving talks at conferences, starting a user group or participating in a user group, join the [https://www.postgresql.org/list/pgsql-advocacy/ pgsql-advocacy@postgresql.org] mailing list. We are always in need of booth volunteers, speakers, case study writers and bloggers.<br />
<br />
If you think you've found a bug in PostgreSQL and are new to our project, we suggest you ask about it on the [https://www.postgresql.org/list/pgsql-general/ pgsql-general] list first, then read our [http://www.postgresql.org/docs/current/static/bug-reporting.html Bug Submission Guidelines], and then use our [http://www.postgresql.org/support/submitbug Bug Reporting form].<br />
<br />
We also have User Group mailing lists, language-specific lists and some other specific projects with their own communities. You can find a comprehensive list of these at: [http://www.postgresql.org/community/lists/ http://www.postgresql.org/community/lists/]<br />
<br />
=== Wiki ===<br />
<br />
Our wiki is active and frequently updated at: http://wiki.postgresql.org. We encourage contributors to add to the material there, and to make corrections to any errors found.<br />
<br />
=== Projects related to PostgreSQL ===<br />
<br />
There are hundreds of projects that depend on, relate to or extend PostgreSQL. You can find a partial list of those projects on the [https://www.postgresql.org/docs/current/external-projects.html External Projects] doc page, [[PGXN]] or [[Pgfoundry]]. Projects are written in a variety of languages, supported by international teams, and are generally fun to hack on. Spend some time exploring the ecosystem of projects around PostgreSQL to get a better feel for the variety and scope of ways that our database is used worldwide.<br />
<br />
=== Our philosophy about conversations/code in public ===<br />
<br />
The PostgreSQL project believes that public code review is the way to achieve our excellent quality of code. Therefore, patches for PostgreSQL must be discussed and submitted in public, and all patches are reviewed publicly. One exception to this policy is that security vulnerabilities may be disclosed to a private mailing list before fixes are published to help prevent exploitation of vulnerable users. <br />
<br />
Related to that, conversations about code, design decisions and user experience occur on the mailing lists. We try to steer all project conversations to the mailing lists so that there is a record of the thought process behind decisions, and so that all the participants and observers of our lists can learn from them.<br />
<br />
=== Resources on contributing to PostgreSQL ===<br />
<br />
* Submitting A Patch: http://wiki.postgresql.org/wiki/Submitting_a_Patch<br />
* Greg Smith, Exposing PostgreSQL Internals with User-Defined Functions http://www.pgcon.org/2010/schedule/attachments/142_HackingWithUDFs.pdf<br />
* Josh Berkus, 50 ways to contribute to PostgreSQL http://www.slideshare.net/PGExperts/50-ways-to-love-your-project<br />
* Laetitia Avrot, De-mystifying contributing to PostgreSQL https://www.slideshare.net/LtitiaAvrot/demystifying-contributing-to-postgresql<br />
<br />
== Acknowledgments ==<br />
<br />
Thanks to Dave Page for feedback, editing and lots of questions.<br />
<br />
[[Category:Community]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PGFGMembers&diff=38358PGFGMembers2023-10-20T14:19:42Z<p>Alvherre: move Hans-Jürgen to Austria</p>
<hr />
<div>== Active PGFG Members ==<br />
<br />
{| cellpadding="5" cellspacing="0" border="1"<br />
!Name<br />
!Location<br />
!Role<br />
|-<br />
<br />
|Andreas Scherbaum<br />
| Germany<br />
|Member<br />
|-<br />
<br />
|Dave Cramer<br />
|Canada<br />
|Backup Liaison<br />
|-<br />
<br />
|Dave Page<br />
|U.K.<br />
|Member<br />
|-<br />
<br />
|David Fetter<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Devrim Gunduz<br />
|U.K.<br />
|Member<br />
|-<br />
<br />
|Granthana Biswas<br />
|Germany<br />
|Member<br />
|-<br />
<br />
|Greg Sabino Mullane<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Hans-Jürgen Schönig<br />
|Austria<br />
|Member<br />
|-<br />
<br />
|Joe Conway<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Jonathan Katz<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Lætitia Avrot<br />
|France<br />
|Member<br />
|-<br />
<br />
|Larry Rosenman<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Marc Fournier<br />
|Canada<br />
|Member<br />
|-<br />
<br />
|Mark Wong<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Michael Meskes<br />
|Germany<br />
|Member<br />
|-<br />
<br />
|Noah Misch<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Oleg Bartunov<br />
|Russian Fed.<br />
|Member<br />
|-<br />
<br />
|Peter Eisentraut<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Renee Phillips<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Robert Treat<br />
|U.S.A.<br />
|Liaison<br />
|-<br />
<br />
|Valeria Kaplan<br />
|U.K.<br />
|Member<br />
|-<br />
<br />
|Vik Fearing<br />
|France<br />
|Member<br />
|-<br />
<br />
|}<br />
<br />
== Past PGFG Members ==<br />
<br />
{| cellpadding="5" cellspacing="0" border="1"<br />
!Name<br />
!Location<br />
!Role<br />
|-<br />
<br />
|A. Elein Mustain<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Andrew Sullivan<br />
|Canada<br />
|Member<br />
|-<br />
<br />
|Christopher Browne<br />
|Canada<br />
|Member<br />
|-<br />
<br />
|Gavin M. Roy<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Josh Berkus<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|Lamar Owen<br />
|U.S.A.<br />
|Member<br />
|-<br />
|Rod Taylor<br />
|Canada<br />
|Member<br />
|-<br />
|Tatsuo Ishii<br />
|Japan<br />
|Member<br />
|-<br />
<br />
|Joshua Drake<br />
|U.S.A.<br />
|Member<br />
|-<br />
<br />
|}<br />
<br />
[[Category:PGFG]]<br />
[[Category:Funds]]<br />
[[Category:Donations]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=38227PostgreSQL 16 Open Items2023-09-04T13:33:00Z<p>Alvherre: add link to ICU bug</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
* {{messageLink|36a6e89689716c2ca1fae8adc8e84601a041121c.camel@j-davis.com| Unexplained behavior when ICU rules is the empty string.}}<br />
** Owner: Peter Eisentraut<br />
** ICU bug reported: https://unicode-org.atlassian.net/browse/ICU-22456<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated: the same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches; still, it is not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
** {{messageLink|CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com|Patch}}<br />
** Owner: Thomas Munro (volunteer LLVM API change chaser)<br />
<br />
* {{messageLink|CAMbWs496%2BN%3DUAjOc%3DrcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw%40mail.gmail.com| Oversight in reparameterize_path_by_child leading to executor crash }}<br />
** Owner: Tom Lane<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
* {{messageLink|CAJ7c6TMBTN3rcz4%3DAjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com| EPQ misbehaves for inherited/partitioned tables}}<br />
** Fixed at: {{PgCommitURL|70b42f279}}, down to 14.<br />
<br />
* {{messageLink|ZEZDj1H61ryrmY9o@msg.df7cb.de|could not extend file "base/5/3501" with FileFallocate(): Interrupted system call}}<br />
** Original commit: {{PgCommitURL|4d330a61bb1}}<br />
** Fixed at: {{PgCommitURL|0d369ac650}}<br />
<br />
* {{messageLink|20230314174521.74jl6ffqsee5mtug%40awork3.anarazel.de|DROP DATABASE is interruptible}}<br />
** Additional discussion: {{messageLink|01020187577238cf-da8c0f4a-3ab9-445a-8c74-31ef51439f30-000000%40eu-west-1.amazonses.com|"PANIC: could not open critical system index 2662" - twice}}<br />
** Fixed at: {{PgCommitURL|c66a7d75}}, down to 11.<br />
<br />
* {{messageLink|17997-a044c27aef95daf8@postgresql.org|Assertion failure when attaching a partition index}}<br />
** Fixed at: {{PgCommitURL|38ea6aa9}}, down to 11.<br />
** Fixed at: {{PgCommitURL|cfc43aeb}}, down to 11.<br />
<br />
* {{messageLink|ae46f2fb-5586-3de0-b54b-1bb0f6410ebd@inbox.ru|Issues with calculations of LimitAdditionalPins}}<br />
** Fixed at: {{PgCommitURL|bd2f46c6559}}<br />
<br />
== Non-bugs ==<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16.0 ===<br />
<br />
* {{messageLink|CAB8KJ%3Dj-ACb3H4L9a_b3ZG3iCYDW5aEu3WsPAzkm2S7JzS1Few%40mail.gmail.com| pg_stat_get_backend_subxact() uses wrong backend ID.}}<br />
** Owner: Nathan Bossart<br />
** Fixed at: {{PgCommitURL|8dfa37b797}}, {{PgCommitURL|133654a05b}}<br />
<br />
=== resolved before 16rc1 ===<br />
<br />
* {{messageLink|CAD21AoDvDmUQeJtZrau1ovnT_smN940%3DKp6mszNGK3bq9yRN6g%40mail.gmail.com| Performance degradation on concurrent COPY into a single relation in PG16 }}<br />
** Fixed at: {{PgCommitURL|d37ab378b6e773c278c14b9554a1ea23b355aab9}}<br />
<br />
=== resolved before 16beta3 ===<br />
* {{messageLink|CAJKUy5g2uZRrUDZJ8p-%3DgiwcSHVUn0c9nmdxPSY0jF0Ov8VoEA@mail.gmail.com|Assertion failure !bms_overlap(joinrel->relids, required_outer)}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|a798660ebe3ff1feb310db13b957c5cda4c8c50d}}<br />
* {{messageLink|ZJp921+nITFnvBVS@paquier.xyz|Add TLI number to name of files generated by pg_waldump --save-fullpage}}<br />
** Owner: Michael Paquier<br />
** Fixed at: {{PgCommitURL|b381d9637030c163c3b1f8a9d3de51dfc1b4ee58}}<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** Fixed at: {{PgCommitURL|0a1d2a7df852f16c452eef8a83003943125162c7}}<br />
* {{messageLink|ZKy4AdrLEfbqrxGJ@telsasoft.com|REINDEX segv on null pointer in RemoveFromWaitQueue}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|bd88404d3cda53810e0b0144713c4b1a1dd965a8}}<br />
<br />
=== resolved before 16beta2 ===<br />
* {{messageLink|17978-12f3d93a55297266@postgresql.org|wrong join order subsequent to removal of delay_upper_joins check}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|3af87736bf5d5c7bea086d962afc2bbf4f29279a}}<br />
* {{messageLink|DFBB2D25-DE97-49CA-A60E-07C881EA59A7@winand.at|Inconsistent nulling bitmap in nestloop parameters}}<br />
** Owner: Tom Lane<br />
** All known issues fixed as of {{PgCommitURL|efeb12ef0bfef0b5aa966a56bb4dbb1f936bda0c}}<br />
* {{messageLink|17976-4b638b525e9a983b@postgresql.org|join removal can no longer skip updating EquivalenceClasses}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|f4c00d138f6dea4c9d8af8ec280b7edc9b0a29e1}}<br />
* {{messageLink|CAH2-Wz%3D8Z9qY58bjm_7TAHgtW6RzZ5Ke62q5emdCEy9BAzwhmg%40mail.gmail.com|Cleaning up nbtree after logical decoding on standby work}}<br />
** Owner: Peter Geoghegan, Andres Freund<br />
** Original commit: {{PgCommitURL|61b313e4}}<br />
** Fixed at: {{PgCommitURL|d088ba5a}}<br />
* {{messageLink|CAMbWs4_tuVn9EwwMcggGiZJWWstdXX_ci8FeEU17vs+4nLgw3w@mail.gmail.com|Assert failure and wrong query results due to incorrectly removing PHV}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|9a2dbc614e6e47da3c49daacec106da32eba9467}}<br />
* {{messageLink|CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw%40mail.gmail.com|Assert failure of the cross-check for nullingrels}}<br />
** Owner: Tom Lane<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** [https://commitfest.postgresql.org/43/4250/ CF Entry]<br />
** Fixed at: {{PgCommitURL|991a3df22}}<br />
* Switch to ICU for 17?<br />
** Owner: Jeff Davis<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
** The open item description is not clear; if it is an open item, it is redundant with the issue "The rules for choosing default ICU locale seem pretty unfriendly". Closed.<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
** Owner: Jeff Davis<br />
** Fixed at: {{PgCommitURL|2535c74b1a}}, {{PgCommitURL|f3a01af29b}}<br />
* {{messageLink|20230613211246.GA219055@nathanxps13|ff9618e creates cache lookup hazards with partition trees}}<br />
** Owner: Nathan Bossart, <s>Jeff Davis</s><br />
** Fixed at: {{PgCommitURL|4dbdb82513}}, {{PgCommitURL|c2122aae63}}<br />
=== resolved before 16beta1 ===<br />
* {{messageLink|CAHewXNnu7u1aT%3D%3DWjnCRa%2BSzKb6s80hvwPP_9eMvvvtdyFdqjw%40mail.gmail.com|ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3}}<br />
** Fixed at: {{PgCommitURL|d0f952691}}<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Reverted at: {{PgCommitURL|b9a7a822723aebb16cbe7e5fb874e5124745b07e}}<br />
<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** Fixed at: {{PgCommitURL|9df8f903eb6758be5a19e66cdf77e922e9329c31}}<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at: {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
* {{messageLink|b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com|Performance regression due to SQLValueFunction removal}}<br />
** Fixed at: {{PgCommitURL|d8c3106bb60e4f87be595f241e173ba3c2b7aa2c}}<br />
<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|093e5c57d506783a95dd8feddd9a3f2651e1aeba}}<br />
<br />
* {{messageLink|ZFhCyn4Gm2eu60rB@paquier.xyz|Table data compression is broken with pg_dump --compress lz4}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|1a05c1d252993b0a59c58a6daf91a2df9333044f}}<br />
<br />
* {{messageLink|94ae9bca-5ebb-1e68-bb7b-4f32e89fefbe@gmail.com|Valgrind unhappy with LZ4F code in pg_dump}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|3c18d90f8907e53c3021fca13ad046133c480e4d}}<br />
<br />
* {{messageLink|20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de|walsender performance regression due to logical decoding on standby changes}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|e101dfac}}<br />
** Fixed at: {{PgCommitURL|bc971f4025c378ce500d86597c34b0ef996d4d8c}}<br />
<br />
== Won't Fix ==<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
** Owner: Thomas Munro<br />
** Original commit: {{PgCommitURL|7389aad6}}<br />
** Issue reclassified as a non-critical improvement to be [https://commitfest.postgresql.org/43/4263/ considered for 17]<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* GA: (Tentative) September 14, 2023<br />
* RC 1: August 31, 2023<br />
* Beta 3: August 10, 2023<br />
* Beta 2: June 29, 2023<br />
* Beta 1: May 25, 2023<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=38145PostgreSQL 16 Open Items2023-08-16T11:09:55Z<p>Alvherre: fixed: Performance degradation on concurrent COPY into a single relation in PG16</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
<br />
* {{messageLink|36a6e89689716c2ca1fae8adc8e84601a041121c.camel@j-davis.com| Unexplained behavior when ICU rules is the empty string.}}<br />
** Owner: Peter Eisentraut<br />
* {{messageLink|CAMbWs496%2BN%3DUAjOc%3DrcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw%40mail.gmail.com| Oversight in reparameterize_path_by_child leading to executor crash }}<br />
** Owner: Tom Lane<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated: the same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches; still, it is not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
** {{messageLink|CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com|Patch}}<br />
** Owner: Thomas Munro (volunteer LLVM API change chaser)<br />
<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
* {{messageLink|CAJ7c6TMBTN3rcz4%3DAjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com| EPQ misbehaves for inherited/partitioned tables}}<br />
** Fixed at: {{PgCommitURL|70b42f279}}, down to 14.<br />
<br />
* {{messageLink|ZEZDj1H61ryrmY9o@msg.df7cb.de|could not extend file "base/5/3501" with FileFallocate(): Interrupted system call}}<br />
** Original commit: {{PgCommitURL|4d330a61bb1}}<br />
** Fixed at: {{PgCommitURL|0d369ac650}}<br />
<br />
* {{messageLink|20230314174521.74jl6ffqsee5mtug%40awork3.anarazel.de|DROP DATABASE is interruptible}}<br />
** Additional discussion: {{messageLink|01020187577238cf-da8c0f4a-3ab9-445a-8c74-31ef51439f30-000000%40eu-west-1.amazonses.com|"PANIC: could not open critical system index 2662" - twice}}<br />
** Fixed at: {{PgCommitURL|c66a7d75}}, down to 11.<br />
<br />
* {{messageLink|17997-a044c27aef95daf8@postgresql.org|Assertion failure when attaching a partition index}}<br />
** Fixed at: {{PgCommitURL|38ea6aa9}}, down to 11.<br />
** Fixed at: {{PgCommitURL|cfc43aeb}}, down to 11.<br />
<br />
* {{messageLink|ae46f2fb-5586-3de0-b54b-1bb0f6410ebd@inbox.ru|Issues with calculations of LimitAdditionalPins}}<br />
** Fixed at: {{PgCommitURL|bd2f46c6559}}<br />
<br />
== Non-bugs ==<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta4 ===<br />
<br />
* {{messageLink|CAD21AoDvDmUQeJtZrau1ovnT_smN940%3DKp6mszNGK3bq9yRN6g%40mail.gmail.com| Performance degradation on concurrent COPY into a single relation in PG16 }}<br />
** Fixed at: {{PgCommitURL|d37ab378b6e773c278c14b9554a1ea23b355aab9}}<br />
<br />
=== resolved before 16beta3 ===<br />
* {{messageLink|CAJKUy5g2uZRrUDZJ8p-%3DgiwcSHVUn0c9nmdxPSY0jF0Ov8VoEA@mail.gmail.com|Assertion failure !bms_overlap(joinrel->relids, required_outer)}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|a798660ebe3ff1feb310db13b957c5cda4c8c50d}}<br />
* {{messageLink|ZJp921+nITFnvBVS@paquier.xyz|Add TLI number to name of files generated by pg_waldump --save-fullpage}}<br />
** Owner: Michael Paquier<br />
** Fixed at: {{PgCommitURL|b381d9637030c163c3b1f8a9d3de51dfc1b4ee58}}<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** Fixed at: {{PgCommitURL|0a1d2a7df852f16c452eef8a83003943125162c7}}<br />
* {{messageLink|ZKy4AdrLEfbqrxGJ@telsasoft.com|REINDEX segv on null pointer in RemoveFromWaitQueue}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|bd88404d3cda53810e0b0144713c4b1a1dd965a8}}<br />
<br />
=== resolved before 16beta2 ===<br />
* {{messageLink|17978-12f3d93a55297266@postgresql.org|wrong join order subsequent to removal of delay_upper_joins check}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|3af87736bf5d5c7bea086d962afc2bbf4f29279a}}<br />
* {{messageLink|DFBB2D25-DE97-49CA-A60E-07C881EA59A7@winand.at|Inconsistent nulling bitmap in nestloop parameters}}<br />
** Owner: Tom Lane<br />
** All known issues fixed as of {{PgCommitURL|efeb12ef0bfef0b5aa966a56bb4dbb1f936bda0c}}<br />
* {{messageLink|17976-4b638b525e9a983b@postgresql.org|join removal can no longer skip updating EquivalenceClasses}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|f4c00d138f6dea4c9d8af8ec280b7edc9b0a29e1}}<br />
* {{messageLink|CAH2-Wz%3D8Z9qY58bjm_7TAHgtW6RzZ5Ke62q5emdCEy9BAzwhmg%40mail.gmail.com|Cleaning up nbtree after logical decoding on standby work}}<br />
** Owner: Peter Geoghegan, Andres Freund<br />
** Original commit: {{PgCommitURL|61b313e4}}<br />
** Fixed at: {{PgCommitURL|d088ba5a}}<br />
* {{messageLink|CAMbWs4_tuVn9EwwMcggGiZJWWstdXX_ci8FeEU17vs+4nLgw3w@mail.gmail.com|Assert failure and wrong query results due to incorrectly removing PHV}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|9a2dbc614e6e47da3c49daacec106da32eba9467}}<br />
* {{messageLink|CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw%40mail.gmail.com|Assert failure of the cross-check for nullingrels}}<br />
** Owner: Tom Lane<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** [https://commitfest.postgresql.org/43/4250/ CF Entry]<br />
** Fixed at: {{PgCommitURL|991a3df22}}<br />
* Switch to ICU for 17?<br />
** Owner: Jeff Davis<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
** The open item description is not clear; if it is an open item, it is redundant with the issue "The rules for choosing default ICU locale seem pretty unfriendly". Closed.<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
** Owner: Jeff Davis<br />
** Fixed at: {{PgCommitURL|2535c74b1a}}, {{PgCommitURL|f3a01af29b}}<br />
* {{messageLink|20230613211246.GA219055@nathanxps13|ff9618e creates cache lookup hazards with partition trees}}<br />
** Owner: Nathan Bossart, <s>Jeff Davis</s><br />
** Fixed at: {{PgCommitURL|4dbdb82513}}, {{PgCommitURL|c2122aae63}}<br />
=== resolved before 16beta1 ===<br />
* {{messageLink|CAHewXNnu7u1aT%3D%3DWjnCRa%2BSzKb6s80hvwPP_9eMvvvtdyFdqjw%40mail.gmail.com|ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3}}<br />
** Fixed at: {{PgCommitURL|d0f952691}}<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Reverted at: {{PgCommitURL|b9a7a822723aebb16cbe7e5fb874e5124745b07e}}<br />
<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** Fixed at: {{PgCommitURL|9df8f903eb6758be5a19e66cdf77e922e9329c31}}<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at: {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
* {{messageLink|b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com|Performance regression due to SQLValueFunction removal}}<br />
** Fixed at: {{PgCommitURL|d8c3106bb60e4f87be595f241e173ba3c2b7aa2c}}<br />
<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|093e5c57d506783a95dd8feddd9a3f2651e1aeba}}<br />
<br />
* {{messageLink|ZFhCyn4Gm2eu60rB@paquier.xyz|Table data compression is broken with pg_dump --compress lz4}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|1a05c1d252993b0a59c58a6daf91a2df9333044f}}<br />
<br />
* {{messageLink|94ae9bca-5ebb-1e68-bb7b-4f32e89fefbe@gmail.com|Valgrind unhappy with LZ4F code in pg_dump}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|3c18d90f8907e53c3021fca13ad046133c480e4d}}<br />
<br />
* {{messageLink|20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de|walsender performance regression due to logical decoding on standby changes}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|e101dfac}}<br />
** Fixed at: {{PgCommitURL|bc971f4025c378ce500d86597c34b0ef996d4d8c}}<br />
<br />
== Won't Fix ==<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
** Owner: Thomas Munro<br />
** Original commit: {{PgCommitURL|7389aad6}}<br />
** Issue reclassified as a non-critical improvement to be [https://commitfest.postgresql.org/43/4263/ considered for 17]<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 3: August 10, 2023<br />
* Beta 2: June 29, 2023<br />
* Beta 1: May 25, 2023<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Trademark_issues&diff=38097Trademark issues2023-07-27T10:45:42Z<p>Alvherre: /* Timeline */ link to new article</p>
<hr />
<div>The PostgreSQL community is currently battling a trademark issue with Fundación PostgreSQL. This article details the timeline of the dispute as well as various courts' findings.<br />
<br />
== Actors ==<br />
<br />
The following parties are involved in this dispute:<br />
<br />
* Core Team ([https://www.postgresql.org/developer/core/ Official Site])<br />
* Fundación PostgreSQL ([https://postgresql.fund/ Official Site])<br />
* PGCA: PostgreSQL Community Association ([https://www.postgres.ca/ Official Site])<br />
* PGEU: PostgreSQL Europe ([https://www.postgresql.eu/ Official Site])<br />
<br />
The following legal entities are presiding over this dispute:<br />
<br />
* EUIPO: European Union Intellectual Property Office ([https://euipo.europa.eu/ Official Site])<br />
* OEPM: <span lang="es">Oficina Española de Patentes y Marcas</span> ([https://oepm.es/ Official Site])<br />
* USPTO: United States Patent and Trademark Office ([https://www.uspto.gov/ Official Site])<br />
<br />
== Timeline ==<br />
<br />
* 2003-07-17, PGCA's '''POSTGRESQL''' trademark is registered in Canada (originally by PostgreSQL Inc.).<br />
* 2011-05-30, The ''PostgreSQL Community Association of Canada'' (''PGCAC'') is registered as an NPO in Canada to steward the PostgreSQL Project's assets (domain names, trademarks, etc.) at the request of the PostgreSQL Core Team.<br />
* 2018-04-17, PGEU's '''POSTGRESQL CONFERENCE''' trademark is registered in the EU/UK.<br />
* 2018-04-20, PGEU's '''POSTGRES CONFERENCE''' trademark is registered in the EU/UK.<br />
* 2018-08-15, PGCA's '''POSTGRES''' trademark is registered in the EU/UK.<br />
* 2018-08-15, PGCA's '''POSTGRESQL''' trademark is registered in the EU/UK.<br />
* 2018-08-15, PGCA's '''POSTGRES''' trademark is registered in the USA.<br />
* 2018-08-15, PGCA's '''POSTGRESQL''' trademark is registered in the USA.<br />
* 2020-04-27, [https://euipo.europa.eu/eSearch/#details/trademarks/W01534836 Fundación PostgreSQL registers EU trademark for '''POSTGRESQL'''] (EUIPO)<br />
* 2020-04-27, [https://euipo.europa.eu/eSearch/#details/trademarks/W01558723 Fundación PostgreSQL registers EU trademark for '''POSTGRESQL COMMUNITY'''] (EUIPO)<br />
* 2020-10-06, PGCA files EUIPO opposition for '''POSTGRESQL''' trademark<br />
* 2020-10-06, PGEU files EUIPO opposition for '''POSTGRESQL''' trademark<br />
* 2020-10-20, [http://consultas2.oepm.es/ceo/jsp/busqueda/consultaExterna.xhtml?numExp=M4089693# Fundación PostgreSQL registers Spanish trademark for '''POSTGRES'''] (OEPM)<br />
* 2020-11-20, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/is-it-time-to-modernize-postgresql-core/ Is it time to modernize the processes, structure and governance of the PostgreSQL Core Team?] (Fundación PostgreSQL) ([https://web.archive.org/web/*/https://postgresql.fund/blog/is-it-time-to-modernize-postgresql-core/ Archive link, including changes])<br />
* 2021-02-16, [https://euipo.europa.eu/eSearch/#details/trademarks/W01598034 Fundación PostgreSQL registers EU trademark for '''POSTGRES'''] (EUIPO)<br />
* 2021-03-21, PGCA files EUIPO opposition for '''POSTGRESQL COMMUNITY''' trademark<br />
* 2021-03-21, PGEU files EUIPO opposition for '''POSTGRESQL COMMUNITY''' trademark<br />
* 2021-06-25, ''Fundación PostgreSQL'' registers Spanish trademark for '''POSTGRES'''<br />
* 2021-09-13, ''Core Team'' and PGCA publish article: [https://www.postgresql.org/about/news/trademark-actions-against-the-postgresql-community-2302/ Trademark Actions Against the PostgreSQL Community] (PostgreSQL.org)<br />
* 2021-09-14, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/respecting-majority-questioning-status-quo-as-a-minority/ Respecting the majority, questioning the status quo as a minority] (Fundación PostgreSQL) ([https://web.archive.org/web/*/https://postgresql.fund/blog/respecting-majority-questioning-status-quo-as-a-minority/ Archive link, including changes])<br />
** This blog post includes the following statement: "we have informed the Core Team that effective immediately Fundación PostgreSQL has unanimously passed a resolution to start the process to transfer, permanently and irrevocably, all PostgreSQL-related trademarks and domain names to the PostgreSQL Association of Canada, with no conditions or costs attached" ([https://web.archive.org/web/20210914223639/https://postgresql.fund/blog/respecting-majority-questioning-status-quo-as-a-minority/ 2021-09-14])<br />
* 2021-09-21, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/postgres-core-team-attacks-postgres-community/ Postgres Core Team launches unprecedented attack against the Postgres Community] (Fundación PostgreSQL) ([https://web.archive.org/web/*/https://postgresql.fund/blog/postgres-core-team-attacks-postgres-community/ Archive link, including changes])<br />
* 2021-10-20, PGCA files EUIPO opposition for '''POSTGRES''' trademark<br />
* 2021-10-20, PGEU files EUIPO opposition for '''POSTGRES''' trademark<br />
* 2022-06-10, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/re-update-on-the-trademark-actions/ Re: Update on the Trademark Actions] (Fundación PostgreSQL) ([https://web.archive.org/web/*/https://postgresql.fund/blog/re-update-on-the-trademark-actions/ Archive link, including changes])<br />
* 2022-06-23, the Spanish courts [https://www.postgres.ca/#2022-06-23 invalidate] the infringing '''POSTGRESQL''' and '''POSTGRESQL COMMUNITY''' trademark registrations.<br />
* 2022-10-05, ''Fundación PostgreSQL'' publishes blog post: [https://postgresql.fund/blog/postgres-trademarks-disagreement-proposing-a-solution/ PostgreSQL Trademarks Disagreement: Proposing a Solution] (Fundación PostgreSQL) ([https://web.archive.org/web/*/https://postgresql.fund/blog/postgres-trademarks-disagreement-proposing-a-solution/ Archive link, including changes])<br />
* 2023-05-09, The USPTO issues a final rejection of ''Fundación PostgreSQL'''s '''POSTGRES''' trademark in the USA<br />
* 2023-05-09, The USPTO issues a final rejection of ''Fundación PostgreSQL'''s '''POSTGRESQL''' trademark in the USA<br />
* 2023-05-09, The USPTO issues a final rejection of ''Fundación PostgreSQL'''s '''POSTGRESQL COMMUNITY''' trademark in the USA<br />
* 2023-07-11, PGCA publishes article: [https://www.postgresql.org/about/news/update-on-continued-trademark-actions-against-the-postgresql-community-2673/ Update on Continued Trademark Actions Against the PostgreSQL Community] (PostgreSQL.org)<br />
* 2023-07-24, ''Fundación PostgreSQL'' publishes blog post claiming that ''PostgreSQL'' is trying to [https://postgresql.fund/blog/the-postgres-core-team-tries-to-shut-down-a-postgres-community-conference/ "shut down"] the Ibiza conference. (Fundación PostgreSQL) ([https://web.archive.org/web/*/https://postgresql.fund/blog/the-postgres-core-team-tries-to-shut-down-a-postgres-community-conference/ Archive link, including changes])<br />
* 2023-07-27, PGCA publishes article: [https://www.postgresql.org/about/news/setting-the-record-straight-more-updates-on-a-trademark-dispute-2682/ Setting the record straight: More updates on a trademark dispute] (PostgreSQL.org)<br />
<br />
== See Also ==<br />
<br />
* [https://news.ycombinator.com/item?id=28512274 Hacker News discussion]<br />
* [https://lwn.net/Articles/869108/ LWN discussion]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=37996PostgreSQL 16 Open Items2023-06-19T13:13:36Z<p>Alvherre: /* Open Issues */ Nathan can take over "ff9618e creates cache lookup hazards with partition trees"</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
* Switch to ICU for 17?<br />
** Owner: Jeff Davis<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** [https://commitfest.postgresql.org/43/4116/ CF Entry]<br />
** NOTE: This is not a committed feature for v16<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
** Owner: Jeff Davis<br />
* {{messageLink|ZEZDj1H61ryrmY9o@msg.df7cb.de|could not extend file "base/5/3501" with FileFallocate(): Interrupted system call}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|4d330a61bb1}}<br />
* {{messageLink|DFBB2D25-DE97-49CA-A60E-07C881EA59A7@winand.at|Inconsistent nulling bitmap in nestloop parameters}}<br />
** Owner: Tom Lane<br />
* {{messageLink|20230613211246.GA219055@nathanxps13|ff9618e creates cache lookup hazards with partition trees}}<br />
** Owner: Nathan Bossart, <s>Jeff Davis</s><br />
* {{messageLink|17976-4b638b525e9a983b@postgresql.org|join removal can no longer skip updating EquivalenceClasses}}<br />
** Owner: Tom Lane<br />
* {{messageLink|17978-12f3d93a55297266@postgresql.org|wrong join order subsequent to removal of delay_upper_joins check}}<br />
** Owner: Tom Lane<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
* [https://www.postgresql.org/message-id/268fd337-8bb7-92e6-0da2-416c022c11f3%40enterprisedb.com Reconsider a utility_query_id GUC to control if query jumbling of utilities can go through the past string-only mode and the new mode?]<br />
** Potential owner: Michael Paquier<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches, but it is still not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
** {{messageLink|CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com|Patch}}<br />
** Owner: Thomas Munro (volunteer LLVM API change chaser)<br />
<br />
* {{messageLink|20230314174521.74jl6ffqsee5mtug%40awork3.anarazel.de|DROP DATABASE is interruptible}}<br />
** Additional discussion: {{messageLink|01020187577238cf-da8c0f4a-3ab9-445a-8c74-31ef51439f30-000000%40eu-west-1.amazonses.com|"PANIC: could not open critical system index 2662" - twice}}<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), which makes the instance unrecoverable and blocks any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
* {{messageLink|CAJ7c6TMBTN3rcz4%3DAjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com|EPQ misbehaves for inherited/partitioned tables}}<br />
** Fixed at: {{PgCommitURL|70b42f279}}, down to 14.<br />
<br />
== Non-bugs ==<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta2 ===<br />
* {{messageLink|CAH2-Wz%3D8Z9qY58bjm_7TAHgtW6RzZ5Ke62q5emdCEy9BAzwhmg%40mail.gmail.com|Cleaning up nbtree after logical decoding on standby work}}<br />
** Owner: Peter Geoghegan, Andres Freund<br />
** Original commit: {{PgCommitURL|61b313e4}}<br />
** Fixed at: {{PgCommitURL|d088ba5a}}<br />
* {{messageLink|CAMbWs4_tuVn9EwwMcggGiZJWWstdXX_ci8FeEU17vs+4nLgw3w@mail.gmail.com|Assert failure and wrong query results due to incorrectly removing PHV}}<br />
** Owner: Tom Lane<br />
** Fixed at: {{PgCommitURL|9a2dbc614e6e47da3c49daacec106da32eba9467}}<br />
* {{messageLink|CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw%40mail.gmail.com|Assert failure of the cross-check for nullingrels}}<br />
** Owner: Tom Lane<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** [https://commitfest.postgresql.org/43/4250/ CF Entry]<br />
** Fixed at: {{PgCommitURL|991a3df22}}<br />
<br />
=== resolved before 16beta1 ===<br />
* {{messageLink|CAHewXNnu7u1aT%3D%3DWjnCRa%2BSzKb6s80hvwPP_9eMvvvtdyFdqjw%40mail.gmail.com|ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3}}<br />
** Fixed at: {{PgCommitURL|d0f952691}}<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Reverted at: {{PgCommitURL|b9a7a822723aebb16cbe7e5fb874e5124745b07e}}<br />
<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** Fixed at: {{PgCommitURL|9df8f903eb6758be5a19e66cdf77e922e9329c31}}<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at: {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
* {{messageLink|b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com|Performance regression due to SQLValueFunction removal}}<br />
** Fixed at: {{PgCommitURL|d8c3106bb60e4f87be595f241e173ba3c2b7aa2c}}<br />
<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|093e5c57d506783a95dd8feddd9a3f2651e1aeba}}<br />
<br />
* {{messageLink|ZFhCyn4Gm2eu60rB@paquier.xyz|Table data compression is broken with pg_dump --compress lz4}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|1a05c1d252993b0a59c58a6daf91a2df9333044f}}<br />
<br />
* {{messageLink|94ae9bca-5ebb-1e68-bb7b-4f32e89fefbe@gmail.com|Valgrind unhappy with LZ4F code in pg_dump}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|3c18d90f8907e53c3021fca13ad046133c480e4d}}<br />
<br />
* {{messageLink|20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de|walsender performance regression due to logical decoding on standby changes}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|e101dfac}}<br />
** Fixed at: {{PgCommitURL|bc971f4025c378ce500d86597c34b0ef996d4d8c}}<br />
<br />
== Won't Fix ==<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
** Owner: Thomas Munro<br />
** Original commit: {{PgCommitURL|7389aad6}}<br />
** Issue reclassified as a non-critical improvement to be [https://commitfest.postgresql.org/43/4263/ considered for 17]<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 2: June 29, 2023<br />
* Beta 1: May 25, 2023<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=37949PostgreSQL 16 Open Items2023-06-06T19:57:04Z<p>Alvherre: /* Open Issues */ add: cleaning up nbtree after ...</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
* Switch to ICU for 17?<br />
** Owner: Jeff Davis<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** [https://commitfest.postgresql.org/43/4116/ CF Entry]<br />
** NOTE: This is not a committed feature for v16<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
** Owner: Jeff Davis<br />
* {{messageLink|CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw%40mail.gmail.com|Assert failure of the cross-check for nullingrels}}<br />
** Owner: Tom Lane<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** [https://commitfest.postgresql.org/43/4250/ CF Entry]<br />
* {{messageLink|ZEZDj1H61ryrmY9o@msg.df7cb.de|could not extend file "base/5/3501" with FileFallocate(): Interrupted system call}}<br />
<br />
* {{messageLink|CAH2-Wz%3D8Z9qY58bjm_7TAHgtW6RzZ5Ke62q5emdCEy9BAzwhmg%40mail.gmail.com|Cleaning up nbtree after logical decoding on standby work}}<br />
** Owner: Peter Geoghegan, Andres Freund<br />
** Original commit: {{PgCommitURL|61b313e4}}<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
* [https://www.postgresql.org/message-id/268fd337-8bb7-92e6-0da2-416c022c11f3%40enterprisedb.com Reconsider a utility_query_id GUC to control if query jumbling of utilities can go through the past string-only mode and the new mode?]<br />
** Potential owner: Michael Paquier<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches, but it is still not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
** {{messageLink|CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com|Patch}}<br />
** Owner: Thomas Munro (volunteer LLVM API change chaser)<br />
<br />
* {{messageLink|20230314174521.74jl6ffqsee5mtug%40awork3.anarazel.de|DROP DATABASE is interruptible}}<br />
** Additional discussion: {{messageLink|01020187577238cf-da8c0f4a-3ab9-445a-8c74-31ef51439f30-000000%40eu-west-1.amazonses.com|"PANIC: could not open critical system index 2662" - twice}}<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), which makes the instance unrecoverable and blocks any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
* {{messageLink|CAJ7c6TMBTN3rcz4%3DAjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com|EPQ misbehaves for inherited/partitioned tables}}<br />
** Fixed at: {{PgCommitURL|70b42f279}}, down to 14.<br />
<br />
== Non-bugs ==<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta2 ===<br />
<br />
=== resolved before 16beta1 ===<br />
<br />
* {{messageLink|CAHewXNnu7u1aT%3D%3DWjnCRa%2BSzKb6s80hvwPP_9eMvvvtdyFdqjw%40mail.gmail.com|ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3}}<br />
** Fixed at: {{PgCommitURL|d0f952691}}<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Reverted at: {{PgCommitURL|b9a7a822723aebb16cbe7e5fb874e5124745b07e}}<br />
<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** Fixed at: {{PgCommitURL|9df8f903eb6758be5a19e66cdf77e922e9329c31}}<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at: {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
* {{messageLink|b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com|Performance regression due to SQLValueFunction removal}}<br />
** Fixed at: {{PgCommitURL|d8c3106bb60e4f87be595f241e173ba3c2b7aa2c}}<br />
<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|093e5c57d506783a95dd8feddd9a3f2651e1aeba}}<br />
<br />
* {{messageLink|ZFhCyn4Gm2eu60rB@paquier.xyz|Table data compression is broken with pg_dump --compress lz4}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|1a05c1d252993b0a59c58a6daf91a2df9333044f}}<br />
<br />
* {{messageLink|94ae9bca-5ebb-1e68-bb7b-4f32e89fefbe@gmail.com|Valgrind unhappy with LZ4F code in pg_dump}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|3c18d90f8907e53c3021fca13ad046133c480e4d}}<br />
<br />
* {{messageLink|20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de|walsender performance regression due to logical decoding on standby changes}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|e101dfac}}<br />
** Fixed at: {{PgCommitURL|bc971f4025c378ce500d86597c34b0ef996d4d8c}}<br />
<br />
== Won't Fix ==<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
** Owner: Thomas Munro<br />
** Original commit: {{PgCommitURL|7389aad6}}<br />
** Issue reclassified as a non-critical improvement to be [https://commitfest.postgresql.org/43/4263/ considered for 17]<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 2: TBD<br />
* Beta 1: May 25, 2023<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=37839PostgreSQL 16 Open Items2023-05-19T09:12:39Z<p>Alvherre: /* Open Issues */ fix markup problem</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
* Switch to ICU for 17?<br />
** Owner: Jeff Davis<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** [https://commitfest.postgresql.org/43/4116/ CF Entry]<br />
** NOTE: This is not a committed feature for v16<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
** Owner: Jeff Davis<br />
* {{messageLink|CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw%40mail.gmail.com|Assert failure of the cross-check for nullingrels}}<br />
** Owner: Tom Lane<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** [https://commitfest.postgresql.org/43/4250/ CF Entry]<br />
* {{messageLink|20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de|walsender performance regression due to logical decoding on standby changes}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|e101dfac}}<br />
** {{messageLink|CALj2ACWeo64RSqf8tDbnSSUm_vbpK5GYdDiiFQk8E3Fg38mBdw@mail.gmail.com|Patch sent}}<br />
* {{messageLink|CAHewXNnu7u1aT%3D%3DWjnCRa%2BSzKb6s80hvwPP_9eMvvvtdyFdqjw%40mail.gmail.com|ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3}}<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
* [https://www.postgresql.org/message-id/268fd337-8bb7-92e6-0da2-416c022c11f3%40enterprisedb.com Reconsider a utility_query_id GUC to control if query jumbling of utilities can go through the past string-only mode and the new mode?]<br />
** Potential owner: Michael Paquier<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice because of confusion about precisely which heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches, but it is still not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
<br />
* {{messageLink|20230314174521.74jl6ffqsee5mtug%40awork3.anarazel.de|DROP DATABASE is interruptible}}<br />
** Additional discussion: {{messageLink|01020187577238cf-da8c0f4a-3ab9-445a-8c74-31ef51439f30-000000%40eu-west-1.amazonses.com|"PANIC: could not open critical system index 2662" - twice}}<br />
<br />
* {{messageLink|CAJ7c6TMBTN3rcz4%3DAjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com|EPQ misbehaves for inherited/partitioned tables}}<br />
** Owner: Tom Lane (86dc90056)<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
== Non-bugs ==<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta1 ===<br />
<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Fixed at (bug only): {{PgCommitURL|db93e739ac61332126207b16f14da93f8ecac594}}<br />
** Fixed at (feature reverted): {{PgCommitURL|b9a7a822723aebb16cbe7e5fb874e5124745b07e}}<br />
<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** Fixed at: {{PgCommitURL|9df8f903eb6758be5a19e66cdf77e922e9329c31}}<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at: {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
* {{messageLink|b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com|Performance regression due to SQLValueFunction removal}}<br />
** Fixed at: {{PgCommitURL|d8c3106bb60e4f87be595f241e173ba3c2b7aa2c}}<br />
<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|093e5c57d506783a95dd8feddd9a3f2651e1aeba}}<br />
<br />
* {{messageLink|ZFhCyn4Gm2eu60rB@paquier.xyz|Table data compression is broken with pg_dump --compress lz4}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|1a05c1d252993b0a59c58a6daf91a2df9333044f}}<br />
<br />
* {{messageLink|94ae9bca-5ebb-1e68-bb7b-4f32e89fefbe@gmail.com|Valgrind unhappy with LZ4F code in pg_dump}}<br />
** Owner: Tomas Vondra<br />
** Fixed at: {{PgCommitURL|3c18d90f8907e53c3021fca13ad046133c480e4d}}<br />
<br />
== Won't Fix ==<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
** Owner: Thomas Munro<br />
** Original commit: {{PgCommitURL|7389aad6}}<br />
** Issue reclassified as a non-critical improvement to be [https://commitfest.postgresql.org/43/4263/ considered for 17]<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 2: TBD<br />
* Beta 1: May 25, 2023<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2023_Developer_Meeting&diff=37838PgCon 2023 Developer Meeting2023-05-19T08:58:33Z<p>Alvherre: /* RSVPs */ add $self to not-attending list</p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Tuesday 30 May, 2023 at the University of Ottawa, prior to pgCon 2023. In order to keep the numbers manageable, this meeting is by '''invitation only'''.<br />
Any questions regarding the invitations to this event should be directed to the team of individuals tasked with coming up with the list of people to invite:<br />
<br />
* Andres Freund<br />
* Stephen Frost<br />
* Dave Page<br />
<br />
An Unconference will be held on Friday for in-depth discussion of technical topics.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the upcoming releases<br />
* Address any proposed timing, policy, or procedure issues<br />
* Receive updates from project sub-teams on their activities and discuss any resulting issues or concerns.<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will (probably) be:<br />
<br />
* 9:00 AM to 12:00 PM<br />
* DMS 3105 - Desmarais Hall, 55 Laurier Avenue East<br />
* University of Ottawa.<br />
<br />
Lunch will be served during the meeting.<br />
<br />
== COVID-19 ==<br />
<br />
The University of Ottawa's COVID-19 guidance can be found at https://www.uottawa.ca/en/covid-19. Wearing masks at the Developer Meeting will be optional; however, we ask that people not attend if they have COVID symptoms or have tested positive.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname). Note that we can accommodate a '''maximum of 30'''!<br />
<br />
# Nathan Bossart<br />
# Joe Conway<br />
# Jeff Davis<br />
# Mark Dilger<br />
# Peter Eisentraut<br />
# Andres Freund<br />
# Stephen Frost<br />
# Etsuro Fujita<br />
# Peter Geoghegan<br />
# Magnus Hagander<br />
# Amit Kapila<br />
# Jonathan Katz<br />
# Alexander Korotkov<br />
# Tom Lane<br />
# Heikki Linnakangas<br />
# Noah Misch<br />
# Thomas Munro<br />
# Dave Page<br />
# Michael Paquier<br />
# Melanie Plageman<br />
# David Rowley<br />
# Masahiko Sawada<br />
# Tomas Vondra<br />
<br />
The following people will not be in Ottawa, and do not plan to attend:<br />
<br />
# Masao Fujii<br />
# Daniel Gustafsson<br />
# Álvaro Herrera<br />
# Tatsuo Ishii<br />
# Amit Langote<br />
# Dean Rasheed<br />
<br />
== Agenda Items ==<br />
<br />
* 16.0 release and commitfest schedule (Dave)<br />
* Improvements to table AM API (Alexander)<br />
* Renaming "master" branch to "main"? (Michael)<br />
* ''Please add suggestions for agenda items here. (with your name)''<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|- <br />
|09:10 - 09:20<br />
|Release and commitfest schedules<br />
|Dave Page<br />
<br />
|- <br />
|??:?? - ??:??<br />
|TBD<br />
|TBD<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|??:?? - ??:??<br />
|TBD<br />
|TBD<br />
<br />
|- <br />
|11:50 - 12:00<br />
|Any other business<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Lunch<br />
|<br />
<br />
|}<br />
<br />
Note: This timetable is a rough guide only. Items will start as soon as the previous discussion is complete (breaks will not move materially, however). Any remaining time before lunch may be used for Commitfest item triage or other activities.<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=37832PostgreSQL 16 Open Items2023-05-17T17:47:38Z<p>Alvherre: Fixed: "Possible regression setting GUCs on \connect"</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
* Switch to ICU for 17?<br />
** Owner: Jeff Davis<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** [https://commitfest.postgresql.org/43/4116/ CF Entry]<br />
** NOTE: This is not a committed feature for v16<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
** Owner: Jeff Davis<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
** Owner: Andres Freund<br />
* {{messageLink|94ae9bca-5ebb-1e68-bb7b-4f32e89fefbe@gmail.com|Valgrind unhappy with LZ4F code in pg_dump}}<br />
** Owner: Tomas Vondra<br />
* {{messageLink|ZFhCyn4Gm2eu60rB@paquier.xyz|Table data compression is broken with pg_dump --compress lz4}}<br />
** Owner: Tomas Vondra<br />
* {{messageLink|CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw%40mail.gmail.com|Assert failure of the cross-check for nullingrels}}<br />
** Owner: Tom Lane<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** [https://commitfest.postgresql.org/43/4250/ CF Entry]<br />
* {{messageLink|20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de|walsender performance regression due to logical decoding on standby changes}}<br />
** Owner: Andres Freund<br />
** Original commit: {{PgCommitURL|e101dfac}}<br />
** {{messageLink|CALj2ACWeo64RSqf8tDbnSSUm_vbpK5GYdDiiFQk8E3Fg38mBdw@mail.gmail.com|Patch sent}}<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
* [https://www.postgresql.org/message-id/268fd337-8bb7-92e6-0da2-416c022c11f3%40enterprisedb.com Reconsider a utility_query_id GUC to control if query jumbling of utilities can go through the past string-only mode and the new mode?]<br />
** Potential owner: Michael Paquier<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice because of confusion about precisely which heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches, but it is still not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
<br />
* {{messageLink|20230314174521.74jl6ffqsee5mtug%40awork3.anarazel.de|DROP DATABASE is interruptible}}<br />
** Additional discussion: {{messageLink|01020187577238cf-da8c0f4a-3ab9-445a-8c74-31ef51439f30-000000%40eu-west-1.amazonses.com|"PANIC: could not open critical system index 2662" - twice}}<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
== Non-bugs ==<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta1 ===<br />
<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Fixed at (bug only): {{PgCommitURL|db93e739ac61332126207b16f14da93f8ecac594}}<br />
** Fixed at (feature reverted): {{PgCommitURL|b9a7a822723aebb16cbe7e5fb874e5124745b07e}}<br />
<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Original commit: {{PgCommitURL|2489d76c4}}<br />
** Fixed at: {{PgCommitURL|9df8f903eb6758be5a19e66cdf77e922e9329c31}}<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
* {{messageLink|b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com|Performance regression due to SQLValueFunction removal}}<br />
** Fixed at: {{PgCommitURL|d8c3106bb60e4f87be595f241e173ba3c2b7aa2c}}<br />
<br />
<br />
== Won't Fix ==<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
** Owner: Thomas Munro<br />
** Original commit: {{PgCommitURL|7389aad6}}<br />
** Issue reclassified as a non-critical improvement to be [https://commitfest.postgresql.org/43/4263/ considered for 17]<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 2: TBD<br />
* Beta 1: May 25, 2023<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=37808PostgreSQL 16 Open Items2023-05-04T10:46:30Z<p>Alvherre: fixed 'Revert ec386948948'</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
'''NOTE''': If known, please list the Owner of the open item.<br />
<br />
* Is it OK that WL_SOCKET_ACCEPT is less fair on Windows than on Unix (and than the coding before 16) when there are multiple server sockets configured?<br />
** {{messageLink|CA%2BhUKG%2BA2dk29hr5zRP3HVJQ-_PncNJM6HVQ7aaYLXLRBZU-xw%40mail.gmail.com|WL_SOCKET_ACCEPT fairness on Windows}} has a (blind) patch to fix that, but would need a Windows hacker to test<br />
* Planner makes improper clause pushdown decisions due to outer-join-aware-Vars changes<br />
** {{messageLink|0b819232-4b50-f245-1c7d-c8c61bf41827@postgrespro.ru|Clause accidentally pushed down}}<br />
** {{messageLink|CAHewXNks3w_Vy9CWoVtHx1XSaeiFpsOzh-zy5eu0Khp1PtG1sA@mail.gmail.com|wrong results due to qual pushdown}}<br />
** Owner: Tom Lane (2489d76c4)<br />
* Switch to ICU for 17?<br />
** {{messageLink|82c4c816-06f6-d3e3-ba02-fca4a5cef065@enterprisedb.com|I suggest waiting until next week to commit it and then see what happens}}<br />
** [https://commitfest.postgresql.org/42/4169/ CF Entry]<br />
* {{messageLink|e587e2ee-7de0-88a2-10f8-c7cf001bab8c%40postgrespro.ru|psql: Add role's membership options to the \du+ command}}<br />
** [https://commitfest.postgresql.org/43/4116/ CF Entry]<br />
** NOTE: This is not a committed feature for v16<br />
* {{messageLink|874jp9f5jo.fsf@news-spur.riddles.org.uk|The rules for choosing default ICU locale seem pretty unfriendly}}<br />
* {{messageLink|20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de|pg_stat_io not tracking smgrwriteback() is confusing}}<br />
* {{messageLink|d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net|Possible regression setting GUCs on \connect}}<br />
** Owner: Alexander Korotkov<br />
** Discussion on reverting {{PgCommitURL|096dd80f3}}<br />
** Original commit: {{PgCommitURL|096dd80f3}}<br />
** Fixed at (bug only): {{PgCommitURL|db93e739ac61332126207b16f14da93f8ecac594}}<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
* [https://www.postgresql.org/message-id/268fd337-8bb7-92e6-0da2-416c022c11f3%40enterprisedb.com Reconsider a utility_query_id GUC to control if query jumbling of utilities can go through the past string-only mode and the new mode?]<br />
** Potential owner: Michael Paquier<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches, but still not something that should be backpatched as-is.<br />
<br />
* {{messageLink|cfcca574-6967-c5ab-7dc3-2c82b6723b99@mail.ru|pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum}}<br />
** {{messageLink|1649062270.289865713@f403.i.mail.ru|Thread with patch}} [https://commitfest.postgresql.org/43/3739/ CF Entry]<br />
<br />
* {{messageLink|17862-1ab8f74b0f7b0611@postgresql.org|WindowAgg startup costs don't take into account partition bound. Can lead to incorrect use of cheap startup plans}}<br />
** {{messageLink|CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com|Patch to fix and discussion}}<br />
<br />
* {{messageLink|1516594.1681482708@sss.pgh.pa.us|We are not compatible with newly-released LLVM 16}}<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
** Fixed at: {{PgCommitURL|8fcb32db98eda1ad2a0c0b40b1cbb5d9a7aa68f0}} and {{PgCommitURL|ffd1b6bb6f8a2ffc929699772610c6925364dbb3}} for HEAD.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
** Fixed at: {{PgCommitURL|a923e21631a29dc8b8781d7d02b5003d0df64ca3}} and {{PgCommitURL|765f5df726918bcdcfd16bcc5418e48663d1dd59}}, down to 14.<br />
<br />
* {{messageLink|CAAKRu_bETD%2BAri600h6fRjX2p8rJSeMAUp%3D_y88juqOZgouTSg%40mail.gmail.com|Can't disable autovacuum cost delay through storage parameter}}<br />
** Fixed at: {{PgCommitURL|bfac8f8bc4a44c67c9f35b5266676278e4ba1217}}, down to 11.<br />
<br />
== Non-bugs ==<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta1 ===<br />
<br />
* Revert {{PgCommitURL|ec386948948}}, per {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
** Reverted at {{PgCommitURL|5472743d9e8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Fixed at: {{PgCommitURL|0070b66fef21e909adb283f7faa7b1978836ad75}}<br />
<br />
* {{messageLink|3590249.1680971629@sss.pgh.pa.us|Assertion failure with parallel full hash join}}<br />
** Fixed at: {{PgCommitURL|b37d051b0e59e4324e346655a27509507813db79}}<br />
<br />
* {{messageLink|ZDDO6jaESKaBgej0@tamriel.snowman.net|De-revert "Add support for Kerberos credential delegation"}}<br />
** Owner: Stephen Frost<br />
** Original commit: {{PgCommitURL|3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454}}<br />
** Revert: {{PgCommitURL|3d03b24c350ab060bb223623bdff38835bd7afd0}}<br />
** De-Revert: {{PgCommitURL|6633cfb21691840c33816a6dacaca0b504efb895}}<br />
** Resolved at: {{PgCommitURL|f7431bca8b0138bdbce7025871560d39119565a0}}<br />
<br />
* {{messageLink|c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com|SSL tests break on non-existing system CA pool}}<br />
** Fixed at: {{PgCommitURL|0b5d1fb36adda612bd3d5d032463a6eeb0729237}}<br />
<br />
* {{messageLink|CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B%3DvVYczMugKMQ%40mail.gmail.com|VacuumUpdateCosts() logging condition incorrect for some initial values of vacuum_cost_delay}}<br />
** Fixed at: {{PgCommitURL|a9781ae11ba2fdb44a3a72c9a7ebb727140b25c5}}<br />
<br />
* {{messageLink|CA%2BhUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ%3DmeA%40mail.gmail.com|check_strxfrm_bug()}}<br />
** Owner: Thomas Munro<br />
** Fixed at: {{PgCommitURL|7d3d72b55edd1b7552a9a358991555994efab0e9}}<br />
<br />
* {{messageLink|20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de|Should we remove vacuum_defer_cleanup_age?}}<br />
** Owner: Andres Freund<br />
** Fixed at: {{PgCommitURL|1118cd37eb61e6a2428f457a8b2026a7bb3f801a}}<br />
<br />
* {{messageLink|2fefa454-5a70-2174-ddbf-4a0e41537139@gmail.com|Add two missing tests in 035_standby_logical_decoding.pl}}<br />
** Fixed at: {{PgCommitURL|376dc820531bafcbf105fff74c5b14c23d9950af}}<br />
** Fixed at: {{PgCommitURL|a6e04b1d20c2e9cece9b64bb5b36ebfdc3a9031b}}<br />
<br />
== Won't Fix ==<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 1: TBD<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_Tutorials&diff=37789PostgreSQL Tutorials2023-04-25T10:00:28Z<p>Alvherre: </p>
<hr />
<div><br />
[https://www.postgresql.org/docs/current/static/tutorial.html PostgreSQL Tutorial in docs] The PostgreSQL docs are a great place to learn about Postgres, and the PostgreSQL Tutorial is the place to start.<br />
<br />
[http://www.postgresqltutorial.com/ PostgreSQL tutorial site] General introduction to PostgreSQL for beginners<br />
<br />
[http://postgresguide.com/ Postgres guide] Covers fundamentals of setup, general SQL, backups, common tools, and Postgres-specific tips. Also covers advanced features like HStore, arrays, JSON, and understanding performance.<br />
<br />
A collection of [https://youtube.com/playlist?list=PLpO6-HKL9JxWyXjze9eCHO9f2WHcNEDYQ video tutorials] and [https://blog.devart.com/category/products/postgresql-tools blog posts] showing how to perform basic tasks with PostgreSQL databases using dbForge Studio for PostgreSQL. <br />
<br />
== Other Tutorials ==<br />
<br />
* [http://w3resource.com/PostgreSQL/tutorial.php/ PostgreSQL Tutorial in detail]<br />
* [http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_tut01 Part 1: Getting Started With PostGIS: An almost Idiot's Guide]: Walks you through how to configure PostGIS, load data, and do common spatial queries.<br />
<br />
[[Category:Howto]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_Tutorials&diff=37788PostgreSQL Tutorials2023-04-25T10:00:04Z<p>Alvherre: remove links to outdated material</p>
<hr />
<div>= PostgreSQL Tutorials = <br />
<br />
[https://www.postgresql.org/docs/current/static/tutorial.html PostgreSQL Tutorial in docs] The PostgreSQL docs are a great place to learn about Postgres, and the PostgreSQL Tutorial is the place to start.<br />
<br />
[http://www.postgresqltutorial.com/ PostgreSQL tutorial site] General introduction to PostgreSQL for beginners<br />
<br />
[http://postgresguide.com/ Postgres guide] Covers fundamentals of setup, general SQL, backups, common tools, and Postgres-specific tips. Also covers advanced features like HStore, arrays, JSON, and understanding performance.<br />
<br />
A collection of [https://youtube.com/playlist?list=PLpO6-HKL9JxWyXjze9eCHO9f2WHcNEDYQ video tutorials] and [https://blog.devart.com/category/products/postgresql-tools blog posts] showing how to perform basic tasks with PostgreSQL databases using dbForge Studio for PostgreSQL. <br />
<br />
== Other Tutorials ==<br />
<br />
* [http://w3resource.com/PostgreSQL/tutorial.php/ PostgreSQL Tutorial in detail]<br />
* [http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_tut01 Part 1: Getting Started With PostGIS: An almost Idiot's Guide]: Walks you through how to configure PostGIS, load data, and do common spatial queries.<br />
<br />
[[Category:Howto]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_Tutorials&diff=37787PostgreSQL Tutorials2023-04-25T09:58:14Z<p>Alvherre: /* PostgreSQL Tutorials */ remove outdated links</p>
<hr />
<div>= PostgreSQL Tutorials = <br />
<br />
[https://www.postgresql.org/docs/current/static/tutorial.html PostgreSQL Tutorial in docs] The PostgreSQL docs are a great place to learn about Postgres, and the PostgreSQL Tutorial is the place to start.<br />
<br />
[http://www.postgresqltutorial.com/ PostgreSQL tutorial site] General introduction to PostgreSQL for beginners<br />
<br />
[http://postgresguide.com/ Postgres guide] Covers fundamentals of setup, general SQL, backups, common tools, and Postgres-specific tips. Also covers advanced features like HStore, arrays, JSON, and understanding performance.<br />
<br />
A collection of [https://youtube.com/playlist?list=PLpO6-HKL9JxWyXjze9eCHO9f2WHcNEDYQ video tutorials] and [https://blog.devart.com/category/products/postgresql-tools blog posts] showing how to perform basic tasks with PostgreSQL databases using dbForge Studio for PostgreSQL. <br />
<br />
== Other Tutorials ==<br />
<br />
=== Performance Optimization ===<br />
<br />
* Selecting [https://www.packtpub.com/sites/default/files/0301OS-Chapter-2-Database-Hardware.pdf Database Hardware] for PostgreSQL<br />
* [https://www.packtpub.com/article/server-configuration-tuning-postgresql Server Configuration Tuning]<br />
<br />
=== Configuring Apache Authentication with PostgreSQL ===<br />
* [http://www.graphica.com.au/postgres-and-apache.html Notes on Installing and Configuring PostgreSQL Authentication for Apache]<br />
<br />
=== Other ===<br />
* [http://www.postgresqlforbeginners.com/ PostgreSQL for beginners]<br />
* [http://w3resource.com/PostgreSQL/tutorial.php/ PostgreSQL Tutorial in detail]<br />
* [http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_tut01 Part 1: Getting Started With PostGIS: An almost Idiot's Guide]: Walks you through how to configure PostGIS, load data, and do common spatial queries.<br />
<br />
[[Category:Howto]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Aggregate_strict_min_and_max&diff=37786Aggregate strict min and max2023-04-25T09:49:01Z<p>Alvherre: change <source> -> <syntaxhighlight>, as a test</p>
<hr />
<div><br />
The aggregates strict_min and strict_max behave like their built-in counterparts, except they will return NULL if any input is NULL. This implementation is substantially slower than the built-in aggregates, and cannot make use of index support.<br />
<br />
== Strict min and max aggregates implementation ==<br />
{{SnippetInfo2|Strict min and max aggregates|version= at least back to 8.4|lang=SQL}}<br />
<br />
<br />
<syntaxhighlight lang="sql"><br />
-- If no values have been delivered to the aggregate, the internal state is the<br />
-- NULL array. If a null value has been delivered, it is an array with one<br />
-- element, which is NULL. Otherwise, it is an array with one element,<br />
-- the least/greatest seen to this point.<br />
CREATE OR REPLACE FUNCTION strict_min_agg (anyarray,anyelement )<br />
RETURNS anyarray LANGUAGE sql IMMUTABLE AS $$<br />
SELECT CASE<br />
WHEN $1 IS NULL THEN ARRAY[$2]<br />
WHEN $1[1] IS NULL THEN $1<br />
WHEN $2 IS NULL THEN ARRAY[$2] -- use $2 not NULL to preserve type<br />
ELSE ARRAY[least($1[1],$2)] END ;<br />
$$;<br />
<br />
CREATE OR REPLACE FUNCTION strict_agg_final (anyarray)<br />
RETURNS anyelement LANGUAGE sql IMMUTABLE AS $$<br />
SELECT CASE when $1 is null then NULL else $1[1] END ;<br />
$$;<br />
<br />
CREATE AGGREGATE strict_min (anyelement) (<br />
sfunc = strict_min_agg,<br />
stype = anyarray,<br />
finalfunc = strict_agg_final<br />
);<br />
<br />
CREATE OR REPLACE FUNCTION strict_max_agg (anyarray,anyelement )<br />
RETURNS anyarray LANGUAGE sql IMMUTABLE AS $$<br />
SELECT CASE<br />
WHEN $1 IS NULL THEN ARRAY[$2]<br />
WHEN $1[1] IS NULL THEN $1<br />
WHEN $2 IS NULL THEN ARRAY[$2] -- use $2 not NULL to preserve type<br />
ELSE ARRAY[greatest($1[1],$2)] END ;<br />
$$;<br />
<br />
CREATE AGGREGATE strict_max (anyelement) (<br />
sfunc = strict_max_agg,<br />
stype = anyarray,<br />
finalfunc = strict_agg_final<br />
);<br />
</syntaxhighlight><br />
<br />
=== Parallel-enabled implementation ===<br />
<br />
<syntaxhighlight lang="sql"><br />
-- For version 9.6 and later, add two combine functions and redefine the aggregates.<br />
CREATE FUNCTION strict_max_combine(anyarray, anyarray) RETURNS anyarray<br />
LANGUAGE sql IMMUTABLE<br />
AS $_$<br />
select case <br />
when $1 is null then $2<br />
when $2 is null then $1<br />
when $1[1] is null then $1<br />
when $2[1] is null then $2<br />
else ARRAY[greatest($1[1],$2[1])] END ;<br />
$_$;<br />
<br />
CREATE FUNCTION strict_min_combine(anyarray, anyarray) RETURNS anyarray<br />
LANGUAGE sql IMMUTABLE<br />
AS $_$<br />
select case <br />
when $1 is null then $2<br />
when $2 is null then $1<br />
when $1[1] is null then $1<br />
when $2[1] is null then $2<br />
else ARRAY[least($1[1],$2[1])] END ;<br />
$_$;<br />
<br />
CREATE AGGREGATE strict_max(anyelement) (<br />
SFUNC = strict_max_agg,<br />
STYPE = anyarray,<br />
FINALFUNC = strict_agg_final,<br />
COMBINEFUNC = strict_max_combine,<br />
PARALLEL = safe<br />
);<br />
<br />
CREATE AGGREGATE strict_min(anyelement) (<br />
SFUNC = strict_min_agg,<br />
STYPE = anyarray,<br />
FINALFUNC = strict_agg_final,<br />
COMBINEFUNC = strict_min_combine,<br />
PARALLEL = safe<br />
);<br />
<br />
</syntaxhighlight><br />
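<br />
If the non-parallel aggregates from the first snippet are already installed, the two CREATE AGGREGATE statements above will fail because the names are taken. A minimal sketch of one way around this, assuming nothing else depends on the existing aggregates, is to drop them first and then run the statements above (on version 12 and later, CREATE OR REPLACE AGGREGATE is an alternative):<br />
<br />
<syntaxhighlight lang="sql"><br />
-- Remove the earlier definitions, if any, before creating the parallel-enabled ones.<br />
DROP AGGREGATE IF EXISTS strict_min(anyelement);<br />
DROP AGGREGATE IF EXISTS strict_max(anyelement);<br />
</syntaxhighlight><br />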
<br />
== Usage ==<br />
<br />
<syntaxhighlight lang="sql">select strict_min(x) from (values (1),(-4),(NULL),(-87)) f(x);</syntaxhighlight><br />
<br />
<syntaxhighlight lang="sql">SELECT group_id, strict_max(some_date) FROM t group by group_id;</syntaxhighlight><br />
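<br />
The difference from the built-in aggregates is easiest to see side by side. The following is a small sketch that assumes the aggregates above have been created; the inline VALUES lists are made up for illustration:<br />
<br />
<syntaxhighlight lang="sql"><br />
-- One input is NULL: min() skips it, strict_min() returns NULL.<br />
SELECT min(x) AS builtin_min, strict_min(x) AS strict_min<br />
FROM (VALUES (1),(-4),(NULL),(-87)) f(x);<br />
-- builtin_min = -87, strict_min = NULL<br />
<br />
-- No NULL inputs: both aggregates agree.<br />
SELECT min(x) AS builtin_min, strict_min(x) AS strict_min<br />
FROM (VALUES (1),(-4),(-87)) f(x);<br />
-- builtin_min = -87, strict_min = -87<br />
<br />
-- No input rows at all: the state stays NULL, so both return NULL.<br />
SELECT min(x) AS builtin_min, strict_min(x) AS strict_min<br />
FROM (VALUES (1)) f(x) WHERE false;<br />
-- builtin_min = NULL, strict_min = NULL<br />
</syntaxhighlight><br />
<br />
Note that a NULL result from strict_min or strict_max therefore does not distinguish between "no rows" and "at least one NULL input".<br />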
<br />
[[Category:SQL]]<br />
[[Category:{{{category|}}} Snippets]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items&diff=37702PostgreSQL 16 Open Items2023-03-30T11:26:46Z<p>Alvherre: </p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
* [https://www.postgresql.org/message-id/20230227044910.GO1653@telsasoft.com pg_dump: lz4 compression uses no persistent state and writes a block header for every row]<br />
** Owner: Tomas Vondra - {{PgCommitURL|0da243fed}}<br />
<br />
* Revert ec386948948?<br />
** {{messageLink|20230330105325.y6uvpalspynf2frt@alvherre.pgsql|Re: "variable not found in subplan target list"}}<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
* [https://www.postgresql.org/message-id/268fd337-8bb7-92e6-0da2-416c022c11f3%40enterprisedb.com Reconsider a utility_query_id GUC to control if query jumbling of utilities can go through the past string-only mode and the new mode?]<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer] [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report] [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report] [https://www.postgresql.org/message-id/flat/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA%40mail.gmail.com Another report] [https://www.postgresql.org/message-id/flat/a53cacb0-8835-57d6-31e4-4c5ef196de1a@deepbluecap.com Another report]<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/ZArVOMifjzE7f8W7%40paquier.xyz Requiring recovery.signal or standby.signal when recovering with a backup_label]<br />
** This is a rather old behavior that affects all stable branches, but still not something that should be backpatched as-is.<br />
<br />
* [https://www.postgresql.org/message-id/flat/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Fix fseek() detection of unseekable files on WIN32]<br />
<br />
=== Fixed issues ===<br />
<br />
== Non-bugs ==<br />
<br />
== Resolved Issues ==<br />
<br />
=== resolved before 16beta1 ===<br />
<br />
* [https://www.postgresql.org/message-id/CAEZATCWETioXs5kY8vT6BVguY41_wD962VDk%3Du_Nvd7S1UXzuQ%40mail.gmail.com ERROR: ORDER/GROUP BY expression not found in targetlist]<br />
** Fixed at: {{PgCommitURL|da5800d5fa636c6e10c9c98402d872c76aa1c8d0}}<br />
<br />
* [https://www.postgresql.org/message-id/20230212233711.GA1316@telsasoft.com various elogs hit by sqlsmith (ExecRTCheckPerms() and many prunable partitions)]<br />
** Fixed at: {{PgCommitURL|c7468c73f7b6e842a53c12eaee5578a76a8fa7a6}}<br />
<br />
* [https://www.postgresql.org/message-id/20230228235834.GC30529@telsasoft.com pg_dump: zlib compression fails for empty objects (LOs)]<br />
** Fixed at: {{PgCommitURL|00d9dcf5bebbb355152a60f0e2120cdf7f9e7ddd}}<br />
<br />
== Won't Fix ==<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Beta 1: TBD<br />
* Feature Freeze: April 8, 2023 0:00 AoE ('''Last Day to Commit Features''')<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
* [[PostgreSQL 15 Open Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Open_Items&diff=37701Open Items2023-03-30T11:25:12Z<p>Alvherre: update link</p>
<hr />
<div>#REDIRECT [[PostgreSQL_16_Open_Items]]<br />
<br />
[[Category:Open_Items]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=FOSDEM/PGDay_2018_Developer_Meeting&diff=37574FOSDEM/PGDay 2018 Developer Meeting2023-02-10T08:44:02Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Thursday 1st February, 2018 at the Brussels Marriott Hotel, prior to FOSDEM/PGDay 2018. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 10 and 11 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Review the progress of the 11.0 schedule, and formulate plans to address any issues<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 5:00PM<br />
* Brussels Marriott Hotel<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be provided.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname) and will be attending:<br />
<br />
* Oleg Bartunov<br />
* Joe Conway<br />
* Andres Freund<br />
* Stephen Frost<br />
* Magnus Hagander<br />
* Petr Jelinek<br />
* Bruce Momjian<br />
* Alexander Korotkov<br />
* Dave Page<br />
* Simon Riggs<br />
* Andreas Seltenreich<br />
* Tomas Vondra<br />
<br />
The following people have sent their apologies:<br />
<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Etsuro Fujita<br />
* Andrew Gierth<br />
* Peter Geoghegan<br />
* Robert Haas<br />
* Alvaro Herrera<br />
* Kyotaro Horiguchi<br />
* Amit Kapila<br />
* KaiGai Kohei<br />
* Tom Lane<br />
* Fujii Masao<br />
* Noah Misch<br />
* Thomas Munro<br />
* Michael Paquier<br />
* Dean Rasheed<br />
* Craig Ringer<br />
* David Rowley<br />
* Masahiko Sawada<br />
* Pavel Stehule<br />
<br />
==Agenda Items==<br />
<br />
Please add agenda items here!<br />
<br />
* Tools for Commitfest process management - changes and future requests (Simon Riggs)<br />
* 11.0 Release Review<br />
* Commitfest item review<br />
* Pluggable storages (Alexander Korotkov)<br />
* CSN & 64-bit xids (Alexander Korotkov)<br />
* Built-in sharding (Bruce Momjian)<br />
* MERGE syntax, sqlsmith and concurrency (Simon Riggs)<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave<br />
<br />
|- <br />
|09:10 - 09:20<br />
|11.0 Release Review<br />
|All<br />
<br />
|- <br />
|09:20 - 10:00<br />
|Pluggable storage<br />
|Alexander<br />
<br />
|- <br />
|10:00 - 10:30<br />
|CSN & 64-bit xids<br />
|Alexander<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|11:00 - 11:45<br />
|MERGE syntax, sqlsmith and concurrency<br />
|Simon<br />
<br />
|- <br />
|11:45 - 12:30<br />
|Built-in sharding<br />
|Bruce<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch<br />
|All<br />
<br />
|- <br />
|13:30 - 15:00<br />
|Open CommitFest Item Review/Hacking time/Additional discussion<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:30<br />
|Tea break<br />
|All<br />
<br />
|- <br />
|15:30 - 16:35<br />
|Open CommitFest Item Review/Hacking time/Additional discussion<br />
|All<br />
<br />
|- <br />
|16:35 - 16:45<br />
|Future developer meetings<br />
|Dave<br />
<br />
|- <br />
|16:45 - 17:00<br />
|Any other business<br />
|Dave<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
|<br />
|}<br />
<br />
== Minutes ==<br />
<br />
<pre><br />
09:00 - 09:10 Welcome and Introductions<br />
=========================================<br />
<br />
Present:<br />
<br />
Joe Conway<br />
Dave Page<br />
Stephen Frost<br />
Tomas Vondra<br />
Petr Jelinek<br />
Bruce Momjian<br />
Simon Riggs<br />
Oleg Bartunov<br />
Alexander Korotkov<br />
<br />
09:10 - 09:20 11.0 Release Review<br />
===================================<br />
<br />
Stephen: Some large items outstanding in CF.<br />
Dave: We're pretty good at not delaying the release for outstanding patches now,<br />
so don't expect any blockers.<br />
Oleg: We have a very large SQL standard JSON patch outstanding. Andrew Dunstan<br />
is reviewing, but we need help.<br />
Oleg: In particular we need help writing docs as it's extremely hard for<br />
non-native English speakers.<br />
Tomas: It's hard to review patches without some docs.<br />
Alexander: You have the SQL standard to describe it<br />
Tomas: Yeah, I'm not going to read that<br />
Dave: We don't need full polished docs to do a review<br />
Joe: Yeah, a README or similar<br />
Oleg: The lead dev is always online and responsive. We do have comprehensive<br />
reference docs, but not user docs.<br />
Oleg: We also have a patch for opclass parameters. Useful for (for example)<br />
length of signature in GIST.<br />
Tomas: We have a rule about no large patches in the last commitfest<br />
Oleg: It's not too big, maybe 1000 lines and not very invasive<br />
Stephen: You can just commit it (and revert later)<br />
Simon: We need a revertfest<br />
Simon: I can't help thinking we're doing this commitfest thing wrong. We have a<br />
big queue of patches, and you're proposing quite reasonable additions that we're<br />
unlikely to get to. We need more reviewers.<br />
Oleg: Maybe we need a development version<br />
Simon: No, I think the issue is lack of primary review<br />
Tomas: I don't think a developer version is the way. We'll get large,<br />
unfinished patches committed, and then what? What does it solve? You could just<br />
share a branch in your private repo.<br />
Stephen: One thing that Andres does well is to include a big description and<br />
latest summary of large patches when emailing them.<br />
Tomas: One way to get reviews is to solicit them directly. We also need people<br />
to review as well as submit patches.<br />
Stephen: Reviewing patches is a great way to learn<br />
Tomas: If you invest time in writing patches, you need to also invest time in<br />
reviewing.<br />
Stephen: Companies need to bake in time to work with the community<br />
Oleg: We try! But find it hard to work in the lists.<br />
<br />
09:20 - 10:00 Pluggable storage<br />
=================================<br />
<br />
Alexander: We have a thread about pluggable storage, with a patch originally<br />
from Alvaro. I've joined the development and made some improvements. There is a<br />
new storage type with a different implementation of MVCC that can do update in<br />
place, and use an undo log. It's very important that these work together. The<br />
important thing for me is even if we retain our tuple identifier, it really<br />
becomes a row identifier pointing to a number of tuples. To implement we need<br />
to add params to our heap storage method, to provide the snapshot which can be<br />
used to find the correct tuple version in the undo chain.<br />
Simon: I don't have much input on the API, but in terms of accepting it into PG,<br />
we don't have an alternative storage method at the moment. I would like this as<br />
I think it's needed for columnar storage, but we don't have such an Open Source<br />
engine, only commercial. We need a purpose for the API as we have to support<br />
it.<br />
Tomas: I think the right word is proprietary not commercial, but without an API,<br />
how can we even test anything? I think the other problem is that the storage is<br />
just a small part of the problem.<br />
Alexander: Maybe storage isn't the right word. Maybe "table engine". In PG Pro<br />
we have an in-memory storage engine that we're looking to make Open Source, but<br />
it uses FDW and is very different. Also, the EDB guys said they're going to<br />
publish zHeap as Open Source, but haven't done so yet as it's not ready.<br />
<br />
[Discussion on EDB's Open Source vs. proprietary decision making and why zHeap<br />
is likely to be open sourced, and why it might be bad if Oracle used such an API<br />
for "InnoDB for Postgres"]<br />
<br />
Stephen: It's a chicken and egg thing - we won't accept all the work for the API<br />
without a storage engine. Both are huge pieces of work.<br />
Simon: We're always changing index APIs<br />
Stephen: We need to ensure we can always extend and enhance our API, without<br />
having to worry about breaking proprietary extensions.<br />
Tomas: We don't break stuff needlessly - but if it is needed, we'll do so in<br />
major releases.<br />
Alexander: I'd like to note that table engines for Postgres will be very<br />
different from MySQL - they will be tightly integrated, unlike MySQL where they<br />
are basically different DBMSs under the same roof. We shouldn't make an API<br />
just for proprietary, but for Open Source and our users first, and additionally<br />
for proprietary extensions.<br />
Dave: I think that should be our default position for everything.<br />
Joe: We should include the ability to prevent certain operations (e.g. BTREEs on<br />
columnar storage as noted by Simon)<br />
Tomas: I'd actually like the blackhole storage engine from MySQL<br />
Simon: Hannu wanted that, so data could be transmitted through WAL, but not<br />
stored locally.<br />
Oleg: We need a reference storage engine.<br />
Tomas: I'd like to work on that, so if you have any ideas.<br />
Oleg: We need a team for v12. Maybe EDB with zHeap, PG Pro with in-memory and<br />
others on the API etc.<br />
Simon: I can support that. If we say this is the API we're writing for v12,<br />
please work on storage engines for that release with that API.<br />
<br />
Oleg: (Recap): for PG 12 we need to support new storage types, in memory,<br />
columnar, zHeap etc. We need an API.<br />
Dave: We need someone to spearhead the effort like Bruce did with sharding<br />
Oleg: I want commitments from the contributing companies to work on parts of<br />
this.<br />
Bruce: This is like sharding (as Dave noted). I needed to re-assure Postgres<br />
users that in the future we will have a multi-node solution, and that took a<br />
lot of PR and organization.<br />
Bruce: Let me think about how I'd PR that<br />
Oleg: It's not about PR, it's about how we get things done.<br />
Tomas: It's probably the right time of the year to do this now as we're right<br />
before the last commitfest. We should do this now, and have a meeting at pgCon.<br />
<br />
Dave: All in favour of proposing to -hackers that we create a roadmap item for<br />
a pluggable storage API and engines being developed by various companies.<br />
<br />
[All voted in favour, except Andreas who abstained]<br />
<br />
TODO: Alexander to discuss with the various parties involved and prepare the<br />
proposal. Meeting to be setup at PGCon.<br />
<br />
<br />
10:00 - 10:30 CSN & 64-bit xids<br />
=================================<br />
<br />
Alexander: CSN is commit sequence number. This is an alternative idea to our<br />
current snapshots which are an array of xmin, xmax and xids. The problem with<br />
this is that it was designed for single core processors, but for servers with<br />
hundreds of cores this approach is of quadratic complexity with the number of<br />
backends. CSN gives a number which orders the commits so we can find a snapshot<br />
with a single number. Original patch by Ants, with work from Heikki and now one<br />
of the PG Pro guys. Unfortunately most of the current community work seems to be<br />
on improving the current model. I think we need to eventually switch to CSN to<br />
avoid significant efforts on micro optimisations of the current approach.<br />
Simon: I'm confused about the difference between CSN and 64bit XIDs. Are <br />
they the same thing?<br />
Alexander: No, they're not the same thing, but they could be overlapped in some<br />
implementation details.<br />
Tomas: How large is the CSN?<br />
Alexander: 64bit. There are some cases of regression over the current<br />
implementation, for example tables with lots of random seeks. One option is <br />
instead of writing a hint bit, we write the CSN in the tuple header.<br />
Discussion with Andres and Heikki who suggested it's hard to guarantee atomic<br />
writes.<br />
Simon: We already use 64bit atomic values in the code and clearly that works.<br />
Alexander: It's not about writing in memory but writing to disk<br />
Simon: There was a suggestion from Robert to switch to a different heap, using<br />
the storage API. We could support both then deprecate the old one.<br />
Simon: What's stopping you from having a hint bit and CSN on one tuple? That<br />
doesn't have the benefit of avoiding lookups, but it does avoid WAL<br />
Alexander: On some workloads you will have regression (but on others, benefits)<br />
Simon: It's not going to happen in this commitfest is it?<br />
Oleg: No!<br />
<br />
[Discussion on whether we have to support 32bit machines in the future (yes) and <br />
how 64bit XIDs would affect that]<br />
<br />
Alexander: The plan for 64bit would be to provide a patch to replace 32bit XIDs<br />
with 64bit XIDs in memory, and then an alternate heap with 64bit XIDs on disk.<br />
Simon: That works - if you get the first part done, all the other patches become<br />
much smaller.<br />
Oleg: Also, the CSN patch is big, but removes a lot of code from Postgres,<br />
making it cleaner.<br />
Alexander: Would you like to review CSN in light of logical decoding (Petr)?<br />
Petr: Yes.<br />
<br />
SIDE TOPIC:<br />
<br />
Release date for PostgreSQL 13 agreed: Friday 13th September 2019!!<br />
<br />
<br />
11:00 - 11:45 MERGE syntax, sqlsmith and concurrency<br />
======================================================<br />
<br />
Simon: MERGE is an SQL standard command. The reason I'm working on it is that I <br />
started in 2008 and would really like to finish it! There's been a lot of<br />
misunderstanding over the years, mostly around concurrency. I think it's<br />
possible to get something into Postgres 11. The current version is 14, but as<br />
of 10a, all standard functionality works. As of 13, all concurrency works.<br />
Simon: SQLsmith. I've come here primarily to talk to you as MERGE is crushingly<br />
complex in its syntax. I'd like to propose working with you to add MERGE<br />
support to SQLsmith, then we'll run it for 20 minutes to generate the next<br />
month's work for me! I think fuzz testing is very important for this. The reason<br />
that fuzz testing is important is that MERGE works and there are no bugs in it<br />
at this time. Most people think it's half-baked crap, but it's actually not -<br />
it's clean code and it works. The code is cleanly distributed in a few areas, <br />
and all the magic happens confined to the executor. Current status is at <br />
https://wiki.postgresql.org/wiki/SQL_MERGE_Patch_Status.<br />
Simon: I consider it committable. It has extensive docs, regression and <br />
isolation tests. Works with everything we can think of.<br />
Simon: Not supported features include RLS. It will error out on tables with RLS <br />
at present, to prevent the first version potentially having security bugs.<br />
Stephen: What concerns me is that we don't normally do that with new features. <br />
Eg. Peter put a lot of effort into INSERT ON CONFLICT. I understand your point,<br />
but I don't really see this as being any different from other situations.<br />
Simon: We have done this before. Partitioning was committed without docs and<br />
without support for INSERT ON CONFLICT. It may not be possible to support<br />
MERGE with partitioning in this release as partitioning still isn't finished.<br />
Tomas: My question is, which part of the code will we have to fix to make MERGE<br />
work well with partitioning? Will we automatically get better plans with<br />
improved partitioning?<br />
Simon: I can't answer that.<br />
<br />
[Discussion about whether to include partitioning support despite likely<br />
generation of poor plans. Simon seems against, others less so]<br />
<br />
Stephen swings the conversation back to the similar issue of RLS support and<br />
agrees to a request from Simon to review for possible security risks.<br />
<br />
Alexander: Does MERGE use the same speculative insertion internal machinery as <br />
INSERT ON CONFLICT?<br />
Simon: Peter Eisentraut asked me that, and I said yes and got a one-word <br />
response of "good". Others, including Peter Geoghegan seem less keen. There is<br />
ongoing discussion about whether or not to attempt an INSERT following an <br />
update failure (e.g. because a snapshot has gone away). Originally it was <br />
suggested that it throw an error which is the code Simon wrote and works. <br />
Others are now saying that was a misunderstanding and the INSERT should be <br />
attempted.<br />
<br />
[Discussion about the expected behaviour of MERGE in certain circumstances, <br />
what seems logical and what other databases do]<br />
[Further discussion about making behaviour controllable via GUC or syntax. Simon<br />
is currently preferring to throw an error until correct behaviour is clear]<br />
<br />
Bruce: I haven't heard this issue discussed this clearly on the mailing lists.<br />
As was the case with SSI, it needed a simplified example to help people<br />
understand the issues.<br />
Joe: This is the part where the spec says "implementation defined" right? Has<br />
anyone tabulated what the other DBMSs do so we can see if there's a clear<br />
answer?<br />
Simon: Not really. I was told to throw an error on the mailing list, so that's <br />
what I did.<br />
Joe: This could be a problem for people migrating.<br />
<br />
11:45 - 12:30 Built-in sharding<br />
=================================<br />
<br />
Bruce: Not a huge amount to report. We're in the multi-year approach to sharding<br />
and we're continuing to improve FDWs and partitioning. I've been looking at <br />
what is committed so far - PG10 almost has enough for read-only sharding. PG11<br />
has a patch for parallel foreign scans that is important.<br />
Oleg: We have a different approach for sharding.<br />
Alexander: We have an extension that uses FDWs and our tsDTM to provide <br />
transaction atomicity. It's on Github. It basically works but doesn't yet <br />
support automatic recovery.<br />
Oleg: We've tested it on 64 nodes and it works. It's Open Source. I don't know<br />
what to do about this. Maybe it's possible to include in pg_contrib?<br />
Alexander: It also uses logical replication for redundancy of shards.<br />
Stephen: It can parallelise queries?<br />
Alexander: Yes, you need several patches to the FDW/core code.<br />
Simon: It would be nice if there was a web page somewhere that explained what <br />
the missing pieces are and the overarching architecture. If we knew what <br />
patches were holding it up, we might be able to get more eyes on them.<br />
Stephen: Can you break it up into smaller parts for commit?<br />
Tomas: There was a patch for the TM API a couple of years ago<br />
Oleg: Yes, there was a problem with the API<br />
Alexander: The problem is lack of review<br />
Tomas: I think the problem was that there was no extension to show use of the<br />
TM API<br />
Oleg: There were two extensions.<br />
Tomas: I do remember that one of the problems with the patch was lack of docs<br />
Oleg: We have a wiki page on it.<br />
Tomas: It's not something that has a chance to get in before the last commitfest<br />
- I would revive it before PGCon so we can discuss there. I think over the last<br />
two years we've got a lot of infrastructure that might make it more viable.<br />
<br />
Commitfest review<br />
==============<br />
<br />
Commitfest item review, with your host, Stephen Frost...<br />
<br />
Future dev meetings<br />
- Dave- What do people want me to do?<br />
- Simon- I don't want more than 1 face-to-face per year; want it announced when/where it's going to be, if there'll be another Brussels meeting then announce soon<br />
- Dave- We don't know exact time<br />
- Simon- Doesn't have to be an exact date<br />
- Stephen- just talking timeframe<br />
- Dave- I can't guarantee that it'll happen ahead of time<br />
- Simon- Main thing is about planning, not about booking flights<br />
- Dave- We will have one at pgCon<br />
- Bruce- Just one per year? But which one, not at pgcon?<br />
- Simon- Not being specific, just only want one per year<br />
- Dave- Should be moved around?<br />
- Simon- could be moved around but still be only one<br />
- Dave- issue with that is not everyone travels, and the timing changes - which impacts what the meeting is about and for<br />
- Simon- if there's more than one meeting then they end up being split<br />
- Dave- if only one then it'll be pgCon, probably remain as admin/procedural meeting + unconference<br />
- Tomas- Having just one would be easier to plan, will mean it's always in US or Europe and reduces number of attendees too<br />
- Dave- that's true, doesn't disagree but there are pros/cons<br />
- Simon- What was changed is that with multiple meetings we couldn't get everyone to all of them, more meetings means more cost or fewer people<br />
- Dave- of those who have been to Japan or Pgcon or here, which seems like the best timing/most useful<br />
- Tomas- Best timing for CF review/patches for next meeting, but probability of getting people here for this meeting is tough<br />
- Dave- Probably won't get Tom here<br />
- Bruce- only one dev meeting then there's no experimenting, only one means no failure possible, either pgEu or Ottawa<br />
- Tomas- Prefer 2 dev meetings, one here or pgConf.Eu & pgCon Ottawa<br />
- Dave- those two confs will likely attract people<br />
- Tomas- Would be extra cost but likely to have good number of people at both, up to Simon to some extent<br />
- Simon- Depends on people too and who wants to travel<br />
- Dave- people who don't want to travel may not go to either<br />
- Petr- big difference based on distance for travel<br />
- Dave- discussed doing one in Asia and did one previously<br />
- Tomas- Asia was a bit of a strange meeting, would keep one in Europe and one in US<br />
- Dave- just did an unconf in Asia instead of a dev meeting<br />
- Bruce- are you saying meeting like this isn't useful with 10 people..?<br />
- Dave- Not useful if no one else finds it useful, happy to continue carrying on with it or is it a waste of time? Fewer people than ever here and every year there is badgering people to get here and to have an agenda worked out<br />
- Bruce- Might not be sure what to do until they get there<br />
- Dave- suggested maybe do this as an unconference style instead of as a dev meeting<br />
- Bruce- kinda done that today<br />
- Tomas- thought it was useful and more developer patch-oriented meeting here and governance, et al, meeting at PgCon then it seems reasonable<br />
- Dave- who thinks we should carry on doing something here<br />
- Nearly everyone votes yes<br />
- Andres- Not cheap to get flights on short notice, please send out notice earlier<br />
- Tomas- here or at pgconf.eu? like it here<br />
- Dave- timing is good here, but may be fewer people<br />
- Tomas- chance that PG hackers come to fosdem is lower<br />
- Dave- stats for pgconf.eu very biased by what country it is in<br />
- Tomas- if we want to have a meeting where there are more people from US, pgconf.Eu is more likely<br />
- Dave- Europe one at FOSDEM or at pgconfEu?<br />
- General favor for pgConfEu<br />
- Dave- Will talk to Magnus about making it happen<br />
- Dave- last question- dev meeting style, or unconf or?<br />
- General favor for dev meeting style<br />
<br />
Any other business<br />
- Bruce- any comments about companies working together?<br />
- Simon- Concrete suggestions regarding meetings and CFM app improvements<br />
- Dave- Concrete suggestions? Being able to vote to help prioritize work on patches?<br />
- Simon- Need a way to say what the items are that we each care about and what things we're each going to do. If I know that something I care about is being done then I can look at reviewing another patch, otherwise can't really look at other items<br />
- Tomas- Not really a general agreement<br />
- Simon- Some form of organized teamwork would be great<br />
- Dave- need further discussion about the CF app in general<br />
- Tomas- generally feel like it works pretty well<br />
- Dave- One problem is that it takes a long time to figure out what the current status is, maybe add more metadata to the patch that anyone can update<br />
- Tomas- When there is a huge thread, once in a while the person who is working on it should provide a summary email when the new patch is submitted<br />
- Dave- Maybe that should be in metadata on the app somehow<br />
- Joe- Or a way to flag the summary message or current point in the thread<br />
- Tomas- Usually go through the thread anyway, would like to see the info embedded in the thread<br />
- Dave- if you're looking for something to spend a couple hours on, better to have a summary available in the app or linked from the app<br />
- Tomas- Should be for items which are in needs review<br />
- Petr- Could match one which is the summary one in the app, maybe with a way to mark it in the app<br />
- Tomas- That could be done, just don't want to have to maintain both the CF app and the thread<br />
- Dave- Probably no specific change to make at this time but things to discuss on the list<br />
- Tomas- issue with picking patch to review, patch in needs review with no reviewers, so go look but turns out there's a huge thread on it<br />
- Dave- out of time, need to defer discussion but with concrete suggestions we can post something to the thread<br />
<br />
</pre><br />
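<br />
The CSN item in the minutes above contrasts two ways of deciding whether a transaction's changes are visible to a snapshot: the current model, where a snapshot carries bounds plus a list of in-progress xids, and a commit sequence number (CSN) model, where a snapshot is a single number. The following is a minimal Python sketch of that contrast only. All names in it (<code>ArraySnapshot</code>, <code>CsnLog</code>, <code>visible_array</code>) are hypothetical illustrations and are not PostgreSQL's data structures or the patch discussed at the meeting; it also assumes a CSN is a simple counter incremented at commit.<br />

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class ArraySnapshot:
    """Hypothetical stand-in for a snapshot made of bounds plus running xids."""
    xmin: int                    # every xid below this had finished at snapshot time
    xmax: int                    # every xid at or above this had not started yet
    in_progress: FrozenSet[int]  # xids in [xmin, xmax) still running at snapshot time


def visible_array(xid: int, snap: ArraySnapshot, committed: Set[int]) -> bool:
    """Visibility check that has to consult the list of in-progress xids."""
    if xid >= snap.xmax:
        return False             # started after the snapshot was taken
    if xid in snap.in_progress:
        return False             # was still running when the snapshot was taken
    return xid in committed      # finished earlier: visible only if it committed


class CsnLog:
    """Hypothetical commit log that assigns an increasing CSN to each commit."""

    def __init__(self) -> None:
        self._next_csn = 1
        self._commit_csn: Dict[int, int] = {}

    def commit(self, xid: int) -> int:
        csn = self._next_csn
        self._commit_csn[xid] = csn
        self._next_csn += 1
        return csn

    def snapshot(self) -> int:
        """A snapshot is just the highest CSN handed out so far."""
        return self._next_csn - 1

    def visible(self, xid: int, snapshot_csn: int) -> bool:
        """One comparison: committed, and committed no later than the snapshot."""
        csn = self._commit_csn.get(xid)
        return csn is not None and csn <= snapshot_csn


if __name__ == "__main__":
    log = CsnLog()
    log.commit(10)                       # xid 10 commits first
    snap_csn = log.snapshot()            # snapshot taken while 11 runs, 12 not started
    arr = ArraySnapshot(xmin=11, xmax=12, in_progress=frozenset({11}))
    log.commit(12)
    log.commit(11)
    committed = {10, 11, 12}
    for xid in (10, 11, 12):
        print(xid, visible_array(xid, arr, committed), log.visible(xid, snap_csn))
```

In this sketch both checks agree on every xid, but taking the CSN snapshot reads one counter, whereas building the array snapshot requires collecting every running xid.<br />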
<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=FOSDEM/PGDay_2019_Developer_Meeting&diff=37573FOSDEM/PGDay 2019 Developer Meeting2023-02-10T08:43:57Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Thursday 31st January, 2019 at the Brussels Marriott Hotel, prior to FOSDEM/PGDay 2019. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 10 and 11 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Review the progress of the 12.0 schedule, and formulate plans to address any issues<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
* Commitfest Triage<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 5:00PM<br />
* Brussels Marriott Hotel<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be provided.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname) and will be attending:<br />
<br />
* Christoph Berg<br />
* Joe Conway<br />
* Andres Freund<br />
* Stephen Frost<br />
* Daniel Gustafsson<br />
* Devrim Gündüz<br />
* Magnus Hagander<br />
* Álvaro Herrera<br />
* Amit Langote<br />
* Thomas Munro<br />
* Dave Page<br />
* Masahiko Sawada<br />
* Tomas Vondra<br />
* Gregory Stark<br />
<br />
The following people have sent their apologies:<br />
<br />
* Peter Eisentraut<br />
* Etsuro Fujita<br />
* Peter Geoghegan<br />
* Kyotaro Horiguchi<br />
* Tatsuo Ishii<br />
* Amit Kapila<br />
* Jonathan Katz<br />
* Tom Lane<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Craig Ringer<br />
* Simon Riggs, on holiday that week<br />
* Pavel Stehule<br />
<br />
==Photo==<br />
[[File:Fosdem-dev-meeting-2019.jpg|800px]]<br />
<br />
Top row, left to right: Andres Freund, Stephen Frost, Daniel Gustafsson, Dave Page, Devrim Gunduz, Magnus Hagander, Amit Langote<br />
<br />
Bottom row, left to right: Álvaro Herrera, Thomas Munro, Greg Stark, Christoph Berg, Joe Conway, Masahiko Sawada, Tomas Vondra<br />
<br />
==Agenda Items==<br />
<br />
Please add agenda items here!<br />
<br />
* Communication between hackers and packagers (Devrim)<br />
* Bug tracking / Bug ID / Links to bug threads (Stephen, and Magnus, though he doesn't know it yet)<br />
* Contribution recognition (Stephen)<br />
* PGCon plans, and such (Stephen)<br />
* RMT for v12 (Stephen, plus whomever...)<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave<br />
<br />
|- <br />
|09:10 - 09:20<br />
|12.0 Release Review<br />
|All<br />
<br />
|- <br />
|09:20 - 09:40<br />
|Communication between hackers and packagers<br />
|Devrim<br />
<br />
|- <br />
|09:40 - 10:00<br />
|Bug tracking / Bug ID / Links to bug threads<br />
|Stephen/Magnus<br />
<br />
|- <br />
|10:00 - 10:20<br />
|Contribution recognition<br />
|Stephen<br />
<br />
|- <br />
|10:20 - 10:30<br />
|RMT for v12<br />
|Stephen<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|11:00 - 11:15<br />
|PGCon plans and such<br />
|Stephen<br />
<br />
|- <br />
|11:15 - 12:30<br />
|???<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch<br />
|All<br />
<br />
|- <br />
|13:30 - 15:00<br />
|Commitfest Triage<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:30<br />
|Tea break<br />
|All<br />
<br />
|- <br />
|15:30 - 16:45<br />
|Commitfest Triage<br />
|All<br />
<br />
<br />
|- <br />
|16:45 - 17:00<br />
|Any other business<br />
|Dave<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
|<br />
|}<br />
<br />
== Minutes ==<br />
<br />
<pre><br />
9.20: 12.0 Release Review<br />
==========================<br />
<br />
Dave: Are we on track, is there anything we want to change about the way we do the release?<br />
Andres: We should decide on a target date<br />
Stephen: We should do that the same as last year<br />
Andres: We had a lot of last minute questions about whether patches go in...<br />
Stephen: We should do some thinking about how to avoid that<br />
Magnus: Knowing the freeze window ahead of time would be good<br />
Greg: Then people will plan to put something in on the last day of the freeze<br />
Stephen: We would like it if people wouldn't put large patches in the last CF<br />
Dave: A problem since ... how many years?<br />
Andres: It's OK, we just kick them into the next CF (unready, new)<br />
Stephen: Triage this afternoon may help with that. What else could we do?<br />
Magnus: For a while we had placeholder patches that people put in so they wouldn't miss the cut-off. We should kill those.<br />
Stephen: The important thing is that we really do that. We had some cases where people felt very strongly that a patch wasn't ready but it went in.<br />
Andres: We need to escalate to other committers faster. We have other committers that only take a look at major patches after commit and then they find a lot of problems.<br />
Stephen: Problem is that when someone signs up no one else will look.<br />
Magnus: We need a way to detect the cases where something needs more committers to review.<br />
Stephen: How about: if a committer says that they have a concern then maybe the item goes on the "open items" list.<br />
Magnus: How about: if one committer objects, a new process that requires another committer to ...<br />
Joe: Perhaps there should be a wording you can use to say you want to block the patch until another couple of committers get involved to resolve the problem.<br />
Stephen: Commitfest status?<br />
Andres:<br />
Stephen: If someone commits after that status is set without a resolution process, then it becomes clear to everyone that it needs to be reverted, without complicated pain.<br />
Magnus: The problem is social, not technical. Let's start with the question of who triggers the process and how it gets resolved.<br />
Andres: Perhaps CCing core, as a documented process. Who can I contact?<br />
Greg: There is a risk of doing things in private.<br />
Stephen: It might go over better if some things are not discussed in public (ie reverting).<br />
Magnus: Technical problems should be addressed on hackers.<br />
Stephen: If we do it on private lists we give people a chance to go back onto -hackers to discuss the technical question. Giving someone an opportunity to publicly reconsider and decide on their own that they want to revert it. private-committers is better than personal emails. We need a policy, and perhaps we could discuss it at pgcon. Who wants to take that action item?<br />
Everybody:<br />
Stephen: If nobody objects I will draft a policy and then we'll see if there are objections. There will be objections.<br />
Magnus: It makes sense to float something on the private committers list, before we get to pgcon. Let's not arrive there without a proposal.<br />
Andres: Discussing it before the next BF to raise awareness would be good.<br />
Stephen: I don't want to rush it, and come across as overbearing.<br />
Andres: Right but we don't have to agree on the policy, just discuss the ideas.<br />
Magnus: So people know that the issue exists.<br />
Stephen: It's on my list of things to do before pgcon. So basically by end of February. Good discussion, I will work up a draft policy and float it.<br />
<br />
Dave: We have a wiki section with procedures.<br />
Andres: The wiki is incredibly out of date.<br />
Magnus: It was also wrong when it wasn't out of date.<br />
Stephen: Some of the policies are on the website and some are on the wiki.<br />
Magnus: We should probably make the documentation scream at you when you're looking at an old version.<br />
Stephen: Policies about development should be in the docs in the source tree.<br />
Magnus: We should have a total index of policies that points to the docs, the wiki, ...<br />
Dave: We could have them in the docs so that they're in the source tree but publish them on the website with other policies.<br />
Andres: It does seem reasonable to have developer policies all in one place.<br />
Andres: The commitfest process should be in there.<br />
Stephen: Action item for Andres.<br />
Dave: I'm going to move stuff from the wiki to the website. I'll take that action point. Archives policy, ... and other ones that are more or less up to date. Those that are not up to date, I'll contact those people. Most of them are probably alright.<br />
<br />
Action points: Andres to document commitfest process. Dave to move stuff from wiki to website. Stephen to propose revert policy.<br />
<br />
9:59: Communication between hackers and packagers<br />
==================================================<br />
Devrim: There was a discussion about renaming a binary. We have to dig. I would ask the hackers to drop an email to the -packagers mailing list.<br />
Dave: I have annoyed people in the past by forgetting to tell people about changes to pgadmin.<br />
Stephen: What things need to go to packagers?<br />
Devrim: Andres did a great job of communicating with me about how to package the JIT stuff.<br />
Stephen: We can put that into a policy document. Tell us what things there are... changing binary names, removing things, new dependencies, ...<br />
Christoph: Nobody told me about the changes to the documentation build tools...<br />
Thomas: Could RMT add a sign-off step, "have we communicated all packaging changes?"<br />
Everybody: No!<br />
Christoph: Should we enable new features by default?<br />
Stephen: That is a whole other question...<br />
Stephen: Floating point dates were a case where the packagers made a choice that we didn't directly control.<br />
Greg: In cases where there is more than one option, like different SSL library implementations, we should leave that to packagers to do whatever is the preferred approach on that platform. Packagers have real policy decisions to make, they're not robots.<br />
Christoph: I had to rewrite pg_config in perl, to support cross compiling.<br />
Alvaro: As a committer, do I need to tell you if I change a binary name?<br />
Dave: Generally it's about knowing when changes are going to happen. Right now we have no coordination with Debian packages etc. I'd like to see us do things more consistently. But we need to know when things are changing upstream. Adding a binary, etc.<br />
Devrim: Example: a while ago I had to add support for pg_basebackup.<br />
Thomas: Does this include header files that we export, and files like the errorcodes.txt?<br />
Magnus: Devrim needs to write a policy on this.<br />
Alvaro: Should we cross-post to -packagers?<br />
Magnus, Stephen: No! Devrim will give us a specific list of things that need to be sent to -packagers.<br />
Christoph: For distributions other than RHEL and Debian, people may not even be following -hackers.<br />
Dave: packagers is not open because it has security information before the general public.<br />
Stephen: Do we need another mailing list for these announcements? More open?<br />
Stephen: Consolidate various other end-user lists?<br />
Dave: Devrim to propose policy to -hackers. Someone needs to start -packagers/-hackers discussion about communication.<br />
Christoph: I will.<br />
Devrim: I would like the PDFs to be built. Sometimes they break, but nobody notices upstream.<br />
Thomas: When the HTML build was OK?<br />
Devrim: Yes.<br />
Dave: Do we need to have the PDFs built on a build farm animal? Shall we ask Andrew to make that an option?<br />
<br />
Action points: Magnus to talk to Andrew. Devrim to propose communication policy document.<br />
<br />
10:30: Coffee.<br />
<br />
11:10: Bug tracking / Bug ID / Links to bug threads <br />
====================================================<br />
<br />
Thomas: Can we add clearer links to the pgadmin bug tracker from the bug reporting page?<br />
Greg: Could we have a drop list and forward bug reports to those projects?<br />
Thomas: Well at least a clearer link...<br />
Stephen: I like the combo box. Magnus?<br />
Greg: I wouldn't have a problem with the list having Advanced Server or RDS etc.<br />
Dave: Progressive reveal from a combo box, starting with a few options and adding more as we need them.<br />
Magnus: I don't think we should generate messages for other projects.<br />
Thomas: You could leave the mailing list at the centre but track the status.<br />
Christoph: You could extend the commitfest app to do that.<br />
Magnus: I have previously proposed that.<br />
Thomas: The problem is that threads started by email (not the form) don't have an ID.<br />
Dave: We own the mailing list software, so we could assign IDs to those.<br />
Stephen: I have previously proposed that.<br />
Christoph: The Wikipedia article for PostgreSQL notes that we have no bug tracker.<br />
Greg: There are a lot of -hackers threads by Tom that describe "known problems".<br />
Greg: I would like to compile a list of those emails.<br />
Thomas: User bugs are not the same as "known problems" like "if you do this and you do that the planner gets confused".<br />
Stephen: A link on the bug reporting page to some known problems?<br />
Dave: If you have a real bug tracker you have to do triage.<br />
Stephen: We'd have to make sure that people can keep doing exactly what they're doing today.<br />
Alvaro: Nathan Wagner runs a system that classifies bugs by reading the -bugs mailing list. I will write to the -hackers mailing list[1].<br />
Christoph: That could become part of the CF app.<br />
Stephen: Great, let's discuss this further on -hackers.<br />
<br />
Action points: Greg to report on "known problems" from the mail list.<br />
<br />
12:10: Contribution recognition <br />
================================<br />
<br />
Stephen: Robert and I have been doing reports on contributions. How do you think that's going?<br />
Andres: It's terrible. People don't get added to Major Contributors and are driven away from the project.<br />
Christoph: Right, I've been trying to get [redacted] put on the contributors list for years but have been told to take it up at pgcon.<br />
Dave: The problem is that no one wants to take responsibility for it.<br />
Stephen: In the past Robert and I have done it at pgcon because core is there and they tell us to do it.<br />
Dave: I can't promise but there is no reason you can't email core during the year.<br />
Andres: You should write 'I am going to add this person in three days unless you object'.<br />
Joe: I think you should have more than Stephen and Robert proposing.<br />
Stephen: You don't think Robert and I disagree enough?<br />
Dave: We need a policy on who can be proposed (including non-backend code contributors?)<br />
Stephen: We need more people.<br />
Magnus: I would suggest someone with more of an outside perspective. I would suggest [redact].<br />
Daniel: We also have the difference between the release note contributors and the website contributors.<br />
Dave: Action item: add description <br />
<br />
Action points: Dave to write better descriptions of contributor classes. Dave to follow up on adding more people to the team that deals with recognition.<br />
<br />
13:40: RMT<br />
===========<br />
<br />
Alvaro: We should have one again.<br />
Stephen: Should we define one before freeze?<br />
Alvaro: The first RMT had a set of rules, but it was so annoying that it was decided not to have rules; each release's RMT decides how it is going to operate.<br />
Andres: Last year there was a feature freeze + RMT announced in March.<br />
Stephen: The last RMT should share information with the next one.<br />
Andres: Should be more aggressive?<br />
<br />
13:50: PGCon<br />
=============<br />
<br />
Stephen: Feedback on who should be at the developer meeting in Ottawa?<br />
Stephen: What is the basis for limiting the list? Can we invite more?<br />
Dave: Room size is a problem.<br />
Dave: It's very good to have packagers in the meeting as we do today.<br />
Stephen: Perhaps we should decide what we're going to talk about and then decide who should be invited.<br />
Dave: Chicken and egg.<br />
Stephen: Dave and I will take an action point to look at past agendas and make sure we're getting the right people.<br />
<br />
Action points: Dave and Stephen to review past topics.<br />
<br />
14:10: Patch triage<br />
====================<br />
<br />
<discussion not recorded in minutes><br />
<br />
[1] https://www.postgresql.org/message-id/flat/201901311104.gwxzhzxu6ns6%40alvherre.pgsql<br />
</pre><br />
<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=FOSDEM/PGDay_2020_Developer_Meeting&diff=37572FOSDEM/PGDay 2020 Developer Meeting2023-02-10T08:43:51Z<p>Alvherre: </p>
<hr />
<div>A meeting for PostgreSQL developers will be held in conjunction with the PostgreSQL Europe FOSDEM Devroom<br />
and FOSDEM/PGDay 2020 events in Brussels, Belgium. The meeting is planned for January 30. In order to keep the meeting focused, productive, and with a high level of interaction, the meeting is invite-only. Due to the large developer community, we cannot invite every active contributor. If you feel that you, or someone else, should've been invited then please contact [mailto:daniel@yesql.se Daniel Gustafsson].<br />
<br />
This is a Community event, organized and financed by PostgreSQL Europe.<br />
<br />
== Time and Location ==<br />
The meeting will be held at the [https://www3.hilton.com/en/hotels/brussels-capital-reg/hilton-brussels-grand-place-BRUGRHI/index.html Hilton Brussels Grand Place Hotel], which is the venue where [https://2020.fosdempgday.org/ FOSDEM PGDay 2020] is arranged the day after on January 31. The meeting starts at 9AM and is scheduled to end at 5PM. Coffee, tea and snacks will be provided during the day, as well as lunch.<br />
<br />
== Attendees ==<br />
The following hackers have RSVP'd to the meeting and will be attending:<br />
* Joe Conway<br />
* Dmitry Dolgov<br />
* Peter Eisentraut<br />
* Andres Freund<br />
* Stephen Frost<br />
* Daniel Gustafsson<br />
* Magnus Hagander<br />
* Bruce Momjian<br />
* Julien Rouhaud<br />
* Tomas Vondra<br />
* Vik Fearing<br />
* Devrim Gündüz<br />
* Álvaro Herrera<br />
* ..more TBA<br />
<br />
The following hackers have been invited but are unable to attend:<br />
* Andrew Dunstan<br />
* Kyotaro Horiguchi<br />
* Alexander Korotkov<br />
* Tom Lane<br />
* Amit Langote<br />
* Noah Misch<br />
* Thomas Munro<br />
* Michael Paquier<br />
* Peter Geoghegan<br />
* Jeff Davis<br />
* Masahiko Sawada<br />
* Simon Riggs<br />
* Andrew Gierth<br />
* John Naylor<br />
* Heikki Linnakangas<br />
* Robert Haas<br />
* Christoph Berg<br />
* Dave Page<br />
<br />
== Suggested Topics ==<br />
Please add topics for discussion to the list:<br />
* CI build feedback/information directly in CF app (Daniel)<br />
* Update from packagers (Devrim)<br />
* Report from SQL working group (Peter)<br />
* CoC committee report (Vik)<br />
* The future of this devmeeting and FOSDEM PGDay (Magnus)<br />
* Commitfest triage<br />
<br />
== Agenda ==<br />
<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:15<br />
|Welcome and introductions<br />
|Daniel<br />
<br />
|- <br />
|09:15 - 09:45<br />
|Report from SQL working group<br />
|Peter E<br />
<br />
|- <br />
|09:45 - 10:15<br />
|The future of this devmeeting and FOSDEM PGDay<br />
|Magnus<br />
<br />
|- <br />
|10:15 - 10:30<br />
|CoC committee report<br />
|Vik<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|<br />
<br />
|- <br />
|11:00 - 11:30<br />
|Update from packagers<br />
|Devrim<br />
<br />
|- <br />
|11:30 - 12:00<br />
|CI build feedback/information directly in CF app<br />
|Daniel<br />
<br />
|- <br />
|12:00 - 12:30<br />
|Overflow slot, other topics<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch<br />
|<br />
<br />
|- <br />
|13:30 - 15:00<br />
|Commitfest Triage<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:30<br />
|Tea break<br />
|<br />
<br />
|- <br />
|15:30 - 16:45<br />
|Commitfest Triage<br />
|All<br />
<br />
<br />
|- <br />
|16:45 - 17:00<br />
|Any other business, plans for next year<br />
|Daniel<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
|<br />
|}<br />
<br />
== Minutes ==<br />
<br />
(minutes by Peter Eisentraut)<br />
<br />
Attendees:<br />
<br />
* Joe Conway<br />
* Dmitry Dolgov<br />
* Peter Eisentraut<br />
* Vik Fearing<br />
* Andres Freund<br />
* Stephen Frost<br />
* Devrim Gündüz<br />
* Daniel Gustafsson<br />
* Magnus Hagander<br />
* Bruce Momjian<br />
* Julien Rouhaud<br />
* Tomas Vondra<br />
<br />
(Álvaro Herrera was registered but did not attend.)<br />
<br />
=== Report from SQL working group ===<br />
<br />
Peter gave report from activities in SQL working group:<br />
<br />
* SQL/MDA was published. Some people were interested in learning more.<br />
* current schedule: SQL:2021 in July 2021<br />
* could align this with PG14<br />
* will contain new JSON data type, integration with existing PG functionality to be determined<br />
* otherwise most focus is on GQL and SQL/PGQ (graph database)<br />
* There will be a committee draft (CD) soon, opportunity for feedback.<br />
<br />
Anyone with questions can contact Peter.<br />
<br />
Question about streaming: It is being talked about in SQL WG, but it's not likely in SQL:2021.<br />
<br />
=== The future of this devmeeting and FOSDEM PGDay ===<br />
<br />
magnus: Do we want to keep the dev meeting? PGEU is happy to keep it if wanted.<br />
magnus: Change to unconference?<br />
stephen: Should we discuss that at the end of the day?<br />
peter: It would be nice to have a dev space at each conference.<br />
andres: This meeting doesn’t have quorum to decide anything.<br />
stephen: People wanted to make the PGCon dev meeting longer.<br />
stephen: PGCon meeting format was changed last year, most people seemed pleased with the outcome.<br />
stephen: Who should be invited? Do we all need to stay all day?<br />
peter: Open up the invitation, with a formal call for participation. Using the PGEU conference system<br />
seems possible.<br />
bruce: Is it a set of committee reports, is it a steering meeting, what is the format?<br />
magnus: We could dial in people as necessary (patch authors).<br />
magnus/stephen: Should different dev meetings have different focus?<br />
[Lots of discussion around last year’s PGCon invitation list. Most<br />
people had concerns that the change was not transparent and was done<br />
too late, not necessarily with the way the meeting turned out.]<br />
peter: Run it like a conference, with an org committee.<br />
stephen: We then still have to decide what the purpose of the meeting is.<br />
tomas: There should be a formal group and formal criteria of how people are invited, it has to be transparent<br />
and public.<br />
[lots more discussion]<br />
<br />
'''Action''': Stephen will gather a committee (Vik and Andres also<br />
volunteered, and he will ask Dave Page) to be in charge of the meeting<br />
in a public way.<br />
<br />
''coffee break''<br />
<br />
=== CoC committee report ===<br />
<br />
Vik reporting<br />
<br />
* CoC 1st anniversary<br />
* Vik and Lætitia stepped down to rotate in new members<br />
* new: Carole Arnaud, Umair Shahid<br />
* one active investigation<br />
* plan for 2020: develop guidelines for conference organizers<br />
<br />
bruce: Has the committee considered whether the committee itself turned out beneficial?<br />
vik: Not specifically but there has been informal feedback that the existence makes people feel safer/better.<br />
bruce: The committee makes it easier to introduce diversity into the community.<br />
[various discussions on the nuances on the CoC process and committee, also about having external people on the<br />
CoC committee]<br />
stephen: feedback: There are communication/handover problems between sysadmins, core, CoC committee.<br />
<br />
(follow-up discussion: Bruce will take up the last item.)<br />
<br />
=== CI build feedback/information directly in CF app ===<br />
<br />
at 11:30 Thomas Munro was dialed in from New Zealand<br />
<br />
daniel: Integrating cfbot into commit fest app would take workload off CFM.<br />
thomas: It runs on FreeBSD with jails and ZFS, how to move to pginfra (Debian).<br />
[Thomas seemed happy for pginfra to take over the hosting of this.]<br />
[Some discussion on the technical details. Both Stephen and Thomas and the rest agreed that it's doable.]<br />
<br />
'''Action''': Stephen will set up VM, Thomas will port to Debian, Magnus will also participate.<br />
<br />
thomas: There is limited capacity on the free tiers of Travis and AppVeyor.<br />
[discussion about getting sponsorship for paid tiers.]<br />
daniel: curl project gets free capacity from AppVeyor.<br />
<br />
'''Action''': Stephen will reach out.<br />
<br />
andres/peter: We should commit .travis.yml etc. into core. Peter will follow up.<br />
<br />
=== Update from packagers ===<br />
<br />
Devrim reporting<br />
<br />
* He wanted to define a process for communicating with packagers, but didn't get it done yet.<br />
* Getting extensions ported to new major versions is a significant problem. The packager is usually the person users end up complaining to.<br />
<br />
devrim: perhaps a meeting with extension authors (pgcon, pgconfeu), make sure extensions are up to date with<br />
new majors<br />
devrim: doesn’t want to ping people; should have a status page of some kind<br />
should we kick out extensions from packaging that are not updated to major releases? — doesn’t fix the problem,<br />
users are still left unable to upgrade<br />
<br />
''lunch''<br />
<br />
[packaging continued]<br />
devrim: wants to find a better way to communicate with extension authors<br />
peter: maybe a wiki page about which extensions are updated<br />
peter: Christoph [Berg, Debian packager] appears to have the building automated and sends bug reports to<br />
extensions.<br />
devrim: Even if the bugs are fixed, no updated versions are formally released.<br />
magnus: maybe test extensions against master continuously<br />
<br />
Devrim will talk to Christoph about being more efficient.<br />
<br />
=== Contributors recognition team ===<br />
<br />
Vik reporting<br />
<br />
This team was launched based on last year's meeting<br />
[[FOSDEM/PGDay_2019_Developer_Meeting]]. The team consists of Vik,<br />
Stephen, Dave Page, Robert Haas.<br />
<br />
vik: feels not heard, disputes about recognizing not just code contributions<br />
stephen, vik: Team is not effective. Team cannot get agreement on what the rules are.<br />
peter, vik: Remove the contributors list on the web site altogether.<br />
andres, daniel: This list gets people jobs.<br />
vik: Make it just developers?<br />
stephen, magnus: We should have a list of team members on web site.<br />
vik: doesn’t want to be on the committee<br />
stephen: Committee requires unanimity to decide anything.<br />
stephen: The committee has no guidance what the rules should be.<br />
peter: I thought the committee was supposed to decide the rules.<br />
bruce: We need to decide whether to do it right or remove all non-code contributors.<br />
peter: Core team needs to take up action; give policy guidance to committee.<br />
<br />
'''Action''': Core team to decide on contributor recognition rules and give that to the committee to implement.<br />
<br />
more discussion on putting a list of teams on the web site, fed from mailing list memberships<br />
Do we need to ask for consent? Do we want to show that for all teams?<br />
Will we require real names? -- That's a different discussion.<br />
<br />
''tea break''<br />
<br />
=== Commitfest Triage ===<br />
<br />
Bruce reporting on transparent data encryption (TDE)<br />
<br />
bruce: TDE having calls every two weeks (Bruce, Joe Conway, HighGo, others), email thread had too many<br />
contributors with limited understanding<br />
bruce: We will produce green field patches, for PG13 key management system only.<br />
bruce: Wiki page is up to date: https://wiki.postgresql.org/wiki/Transparent_Data_Encryption<br />
tomas: thread is confusing, need to reduce scope<br />
andres: concerned about system impact<br />
bruce: https://commitfest.postgresql.org/26/2196/ is current<br />
various discussion on technical details<br />
andres, others: Wiki page should be cleaned up and actual implementation plan added for review by hackers.<br />
<br />
The meeting then reviewed some other patches in the commitfest app.<br />
<br />
=== other ===<br />
<br />
peter: thoughts on release management team?<br />
andres: last year's team had no time zone overlap, bad<br />
peter: thoughts on PG12 release quality?<br />
andres: we need better (not necessarily more) tests<br />
try codecov.io (make available in cfbot?)<br />
add a “Tests” topic to cf app<br />
<br />
about this meeting: do it again<br />
andres: maybe consider splitting up into groups, could be done in the same room<br />
<br />
minutes ended 17:10<br />
<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=FOSDEM/PGDay_2023_Developer_Meeting&diff=37571FOSDEM/PGDay 2023 Developer Meeting2023-02-10T08:43:47Z<p>Alvherre: </p>
<hr />
<div>== FOSDEM 2023 Developer Meeting schedule by Time and Room ==<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Madrid<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 8:30-9:00<br />
|Welcome and Introductions<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 9:00-9:30<br />
|Improving index performance<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 9:30-10:00<br />
|Extensions & SMGR<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 10:00<br />
|Coffee<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 10:00-10:30<br />
|TAMs<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 10:30-11:00<br />
|XLog Format<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 11:00-11:30<br />
|Page Format<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 11:30-12:00<br />
|ResourceOwner Patch<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 12:00-12:30<br />
|ICU / Collations<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 12:30-13:00<br />
|Lunch<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 13:00-13:30<br />
|Improving wait events<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 13:30-14:00<br />
|Extensions & Stats<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 14:00<br />
|Coffee<br />
<br />
|- style="background-color:lightgray;"<br />
|Wed 14:00-17:00<br />
|v16 Patch Triage<br />
|}<br />
<br />
<br />
== Notes/Minutes ==<br />
<br />
=== Improving index performance ===<br />
<br />
Matthias has been working on improving index performance and is concerned about how much interest there is in improving B-Tree performance. The work improves how we use the data on the page: with multiple columns, we can sometimes skip comparing the first few columns because they're likely to be equal. He would like to know whether it's worth continuing, since responses have come mainly from Peter G and not others, and wants to gauge interest. Other topics: nbtree performance improvements, specialization on .. PGConf.Eu presentation which showed improvements are still possible. Sort on index key ranges with a min/max index instead of a whole-table sort. Faster top-N sort with BRIN; Tomas is working on this.<br />
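<br />
To make the "min/max index instead of a whole-table sort" idea concrete, here is a minimal sketch in Python. It is not the patch under discussion; the function name <code>top_n_smallest</code> and the list-of-lists stand-in for block ranges are assumptions made for illustration. Each range is summarized by its minimum, ranges are visited in order of that minimum, and the scan stops once no remaining range can contain a smaller value.<br />
<br />
```python
import heapq


def top_n_smallest(ranges, n):
    """Return (the n smallest values, how many ranges were read).

    `ranges` is a list of lists standing in for block ranges of a table.
    Only the per-range minimum is needed for an ascending top-N; a
    descending top-N would use the per-range maximum in the same way.
    """
    if n <= 0:
        return [], 0
    # Per-range summaries, as a min/max index would keep them.
    summaries = sorted((min(r), i) for i, r in enumerate(ranges) if r)
    best = []  # negated values: a max-heap holding the n smallest so far
    ranges_read = 0
    for range_min, i in summaries:
        # Every remaining range has a minimum >= range_min, so once the
        # largest kept value is <= range_min nothing smaller can appear.
        if len(best) == n and -best[0] <= range_min:
            break
        ranges_read += 1
        for value in ranges[i]:
            if len(best) < n:
                heapq.heappush(best, -value)
            elif value < -best[0]:
                heapq.heapreplace(best, -value)
    return sorted(-v for v in best), ranges_read


ranges = [
    [57, 50, 59, 53],
    [3, 9, 0, 4, 1, 2],
    [25, 20, 29],
    [14, 10, 19],
    [95, 90],
]
smallest, ranges_read = top_n_smallest(ranges, 5)
print(smallest, ranges_read)
```
<br />
The saving depends on how well the ranges are separated: when ranges overlap heavily, most of them still have to be read.<br />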
<br />
Andres- Are these changes attacking the most common performance issues? BTree index performance is limited more by CPU overhead and less by how data is stored. The code builds search keys from scratch and seems designed for cache misses. There are very basic optimizations that we should be doing to improve btree performance. Benefits from how data is stored are constrained by these other issues. We don't keep the block numbers anywhere useful so we have to get-block-number all the time and that is horrible for performance due to cache misses and cache lines.<br />
<br />
Matthias- I see your point, not something I had been looking at when I started working on it.<br />
<br />
Heikki- These are orthogonal changes<br />
<br />
Andres- Don't find the structural changes as interesting due to the CPU overhead, et al<br />
<br />
Matthias- I get that these changes could be done too and would complement each other.<br />
<br />
Peter E- Initial patches didn't have good performance numbers or clear improvements and so wasn't clear how it was going to help. Selling the patch better would help get interest.<br />
<br />
Matthias- Lots of people have SSDs and fast storage and use single-key indexes, and in those cases these patches don't really help. Much of the work is making sure that these cases don't degrade while improving the multi-key cases.<br />
<br />
Heikki- Are these useful on their own?<br />
<br />
Matthias- Yes, the patches are useful on their own. Improvement in multi-key indexes while not degrading the default case. Makes it complicated.<br />
<br />
Jeff- What was your motivation to work on this?<br />
<br />
Matthias- We had a really large index across three columns which was really slow at a prior company to do lookups. We couldn't use btree deduplication for $reasons. Was thinking "why is this so slow?" and was largely because the attribute had to be compared for every column but we don't need to do that in every case because we know that the first columns are the same. Improvements seen 10-20% on index insertion and lookup.<br />
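<br />
The column-skipping idea can be illustrated with a small Python sketch. It is not the nbtree patch; the helper names <code>compare_from</code> and <code>search</code> are hypothetical. During a binary search over sorted multi-column keys, the search remembers how many leading columns the search key shares with the current lower and upper bounds. Every row between those bounds must share at least the smaller of the two prefixes, so comparison can start after it.<br />
<br />
```python
import bisect


def compare_from(key, row, start):
    """Compare key with row column by column, beginning at column `start`.

    Returns (sign, equal_columns, columns_examined). equal_columns counts
    the leading columns known to match, including the skipped ones.
    """
    for i in range(start, len(key)):
        if key[i] != row[i]:
            return (-1 if key[i] < row[i] else 1), i, i - start + 1
    return 0, len(key), len(key) - start


def search(sorted_rows, key, skip_prefix=True):
    """Leftmost insertion point for key, plus the number of columns compared."""
    lo, hi = 0, len(sorted_rows)
    lo_equal = hi_equal = 0  # prefix shared with the lower / upper bound row
    examined = 0
    while lo < hi:
        mid = (lo + hi) // 2
        # Rows between the two bounds share at least this many columns.
        start = min(lo_equal, hi_equal) if skip_prefix else 0
        sign, equal, cost = compare_from(key, sorted_rows[mid], start)
        examined += cost
        if sign > 0:
            lo, lo_equal = mid + 1, equal
        else:
            hi, hi_equal = mid, equal
    return lo, examined


rows = sorted((a, b, c) for a in range(3) for b in range(3) for c in range(50))
key = (1, 2, 17)
expected = bisect.bisect_left(rows, key)
pos_skip, cost_skip = search(rows, key, skip_prefix=True)
pos_full, cost_full = search(rows, key, skip_prefix=False)
print(pos_skip, cost_skip, cost_full)
```
<br />
Both variants visit the same rows and return the same position; only the number of per-column comparisons differs, which matters most when the leading columns are expensive to compare.<br />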
<br />
Jeff- That sounds pretty compelling, but from the thread it was hard to tell what the improvements in the patch were.<br />
<br />
Matthias- 31 text column index which go into compare path plus one uuid patch which gave 200-400% improvements because we can skip the earlier columns.<br />
<br />
Heikki- You can construct cases which can show the improvement.<br />
<br />
Jeff- Constructed cases aren't very compelling but actual use cases which show strong 10-20% improvement are a much better way to sell the patch.<br />
<br />
Jeff- Which collation provider was being used?<br />
<br />
Matthias- The non-default collation because it's more expensive which helps demonstrate the improvement, but even with the default collation there were improvements.<br />
<br />
Andres & Matthias discussion about better approaches to scanning and constants.<br />
<br />
Matthias- In PG14, Peter G committed some changes to btree where it tries to delete items on the page before doing a split, to avoid the split. We may be able to implement the same for the other index types, which could improve their performance. I don't have time to work on it currently but someone else could.<br />
<br />
Andres- Index insertion path improvements by doing a pre-sort which can help a lot. No reason to not do a pre-sort when doing batch inserts. Not able to do it in every case due to triggers and such but in many cases it could be done and would help performance.<br />
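<br />
One way to see why pre-sorting a batch can help is a toy locality model in Python. This is an assumption made for illustration, not an explanation given in the meeting and not a model of PostgreSQL's nbtree code: the leaf page for a key is taken to be simply <code>key // keys_per_leaf</code>, and the sketch counts how often consecutive insertions land on a different leaf.<br />
<br />
```python
import random


def leaf_switches(keys, keys_per_leaf=100):
    """Count how often consecutive insertions target a different leaf.

    Toy model: the leaf number is key // keys_per_leaf, so no splits or
    tree descent are simulated, only the order in which leaves are touched.
    """
    switches = 0
    previous_leaf = None
    for key in keys:
        leaf = key // keys_per_leaf
        if previous_leaf is not None and leaf != previous_leaf:
            switches += 1
        previous_leaf = leaf
    return switches


rng = random.Random(0)  # fixed seed so the run is repeatable
batch = [rng.randrange(100_000) for _ in range(5_000)]
unsorted_switches = leaf_switches(batch)
sorted_switches = leaf_switches(sorted(batch))
print(unsorted_switches, sorted_switches)
```
<br />
In sorted order each leaf is visited once, so the number of switches is bounded by the number of distinct leaves touched; in arrival order almost every insertion moves to a different leaf.<br />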
<br />
Mark Dilger- What is causing the improvement?<br />
<br />
Andres- Just try it and you'll see the improvement.<br />
<br />
Heikki- The table will also immediately be sorted and so you don't have to CLUSTER. If index could keep track of recent inserts then it could order them and insert them.<br />
<br />
Andres- Like GIN fast insert.<br />
<br />
Heikki- Yes, but better. Could be done pretty simply in the index access method..<br />
<br />
Andres- Not sure doing it in index access method is best as it could be the same code copied a bunch of times, better to do it higher up..<br />
<br />
Heikki- Yeah, better to have a batch insert access method function<br />
<br />
Bruce- Folks are frustrated and sometimes the people working on things may feel like a 'lone ranger'. Is there anything we can do to try to avoid that? Particularly with the more specialized stuff, the farther down you go, the more people's eyes may glaze over. We encourage people to take on these hard problems but it may seem like people don't care about it. People really do care about it and a lot of people are interested, so don't feel that people don't care. Anything we can do to try to improve on that?<br />
<br />
Heikki- Some folks have said that working on PG can have an impact on mental health due to frustration of working on PG.<br />
<br />
Bruce- Not a topic here but feels like something we could work on improving.<br />
<br />
Matthias- The commitfest topics and patch names aren't very descriptive of what expertise is needed to review the patches. Maybe could add "area" such as "indexes" or "access methods" or such to the commitfest system.<br />
<br />
Peter E- How are those defined, could we change them?<br />
<br />
Magnus- Might be able to be modified by CF Admin or superuser.<br />
<br />
Peter E- Always wished for adding categories because lots of patches end up under Misc.<br />
<br />
Matthias- I don't mind the topic part but also need the distinction of what part of the code is being modified and what expertise is needed.<br />
<br />
Peter E- How would you do it? Maybe tags?<br />
<br />
Matthias- Maybe general areas would help.<br />
<br />
Andres- Doubt the commitfest app is where the issue is here.<br />
<br />
Matthias- Maybe could make it easier to find patches.<br />
<br />
Peter E- Have to sell your patches.<br />
<br />
Heikki- Problem of contributors getting frustrated and going away. People post patches they feel are brilliant but then they don't get any feedback on it. Or people put time and effort and then don't hear anything for months and then the result is that we don't want it.<br />
<br />
Mark- I'm also working on index improvement but going at it in a different direction, changing heap code and not index code to make improvements and so it isn't clear if there's an overlap there or not.<br />
<br />
=== Extensions & SMGR ===<br />
<br />
Matthias- At our company we would like to be an extension and not a fork, but we bind deeply into the SMGR APIs. How accepting would PG be of making SMGR available to external users, to avoid having to be a fork?<br />
<br />
Peter E- Way back to 6.0 we had this<br />
<br />
Matthias- We resurrected that<br />
<br />
Peter E- Also looked into this and just have to come up with a way to do it but everything is today hard-coded. Maybe like tablespace with a local override or something.<br />
<br />
Andres- Has to work at all times including in an inconsistent state because we use it during recovery and therefore can't look at catalogs, et al. Worried about code complexity for core PG where PG is just making things easier for forks or other projects but making it too complicated for core without any benefit.<br />
<br />
Heikki- Let's imagine opening up smgr read and smgr write<br />
<br />
Peter E- That or md.c functions?<br />
<br />
Heikki- Does it make a difference?<br />
<br />
Andres- Could use compiler flags to override that using linker magic.<br />
<br />
Mark- Change the whole cluster?<br />
<br />
Heikki- Yes, across the whole cluster.<br />
<br />
Mark- Can't really do that from an extension then because you need to init the DB and load the extension first.<br />
<br />
Andres- You would only be able to do this at initdb if for the whole cluster<br />
<br />
Heikki- Also use md.c for temporary files and things instead too.<br />
<br />
Andres- How are you handling figuring out when to use what?<br />
<br />
Heikki- Hack in smgr.c to pass in a flag to indicate the kind of table<br />
<br />
Matthias- Passed in the type of relation to then control how we access the files<br />
<br />
Heikki- Or we could have it a level above to specify the functions to access<br />
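<br />
A minimal sketch of the dispatch idea discussed above, written in Python for brevity rather than in PostgreSQL's C. All names here (<code>RelKind</code>, <code>LocalStorage</code>, <code>RemoteStorage</code>, <code>StorageSwitch</code>) are hypothetical and are not part of smgr.c or md.c. The only point is that the caller passes the kind of relation and a table one level up picks the read/write functions, without a catalog lookup.<br />
<br />
```python
from enum import Enum


class RelKind(Enum):
    """Hypothetical relation kinds a caller might pass down."""
    PERMANENT = "permanent"
    TEMPORARY = "temporary"


class LocalStorage:
    """Stand-in for md.c-style local files: blocks live in a dict."""

    def __init__(self):
        self.blocks = {}

    def write(self, rel, blockno, data):
        self.blocks[(rel, blockno)] = data

    def read(self, rel, blockno):
        return self.blocks[(rel, blockno)]


class RemoteStorage(LocalStorage):
    """Stand-in for an external storage service; counts requests."""

    def __init__(self):
        super().__init__()
        self.requests = 0

    def write(self, rel, blockno, data):
        self.requests += 1
        super().write(rel, blockno, data)

    def read(self, rel, blockno):
        self.requests += 1
        return super().read(rel, blockno)


class StorageSwitch:
    """Chooses an implementation using only the relation kind passed in."""

    def __init__(self, table):
        self.table = table

    def write(self, kind, rel, blockno, data):
        self.table[kind].write(rel, blockno, data)

    def read(self, kind, rel, blockno):
        return self.table[kind].read(rel, blockno)


local, remote = LocalStorage(), RemoteStorage()
switch = StorageSwitch({RelKind.TEMPORARY: local, RelKind.PERMANENT: remote})
switch.write(RelKind.TEMPORARY, "tmp_rel", 0, b"scratch")
switch.write(RelKind.PERMANENT, "orders", 0, b"page-0")
```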
<br />
Mark- This is for storage as a service, so you could just swap out the whole storage manager<br />
<br />
Matthias- We are modifying the functions that call into the storage manager to call our hooks so we don't use the smgr storage structs<br />
<br />
Heikki- We have hacked to pass in what kind of table it is<br />
<br />
Jeff- Could we do it at tablespace level instead<br />
<br />
Matthias- We don't currently force all temporary tables to a particular place<br />
<br />
Heikki- May have an issue with what Andres said about being able to recover before being able to do catalog lookups<br />
<br />
Andres- yeah, not sure that tablespace makes sense for this.<br />
<br />
Heikki- Other things that you could do, maybe encryption or compression, not sure if that would work. Another cool thing would be to do something with backup to fetch the data on-demand from like pgbackrest.<br />
<br />
Peter E- Maybe instead of storing on disk, store out in the cloud somewhere. Not easy to play with the changes since there isn't a good API<br />
<br />
Jeff- What about the non-smgr file access that's going on.<br />
<br />
Heikki- All of the SLRUs, work around that by not doing things with those and those are still stored locally. Would be cool if they also went through smgr API.<br />
<br />
Mark- How would that work? Multiple nodes connected..<br />
<br />
Heikki- When you start up PG, you need a base backup to start from with control file and other little things and the SLRUs. Just restore those as-is. SLRUs are pretty small so that is ok. All of the relations are stored in the cloud and get accessed through the SMGR API. Only one writer to handle the sync issues.<br />
<br />
Mark- This isn't multi-master?<br />
<br />
Heikki- No, this isn't multi-master. All the table and index accesses go through the SMGR API. There are some places where we have to modify the code for index builds. One thing causing us trouble is some writes through SMGR are not going to be WAL logged. GIST index builds the index first and then WAL logs everything after. smgr write for us is a no-op and then on read we reconstruct from WAL. Maybe add assertion that everything is WAL logged to make sure all modified pages are actually WAL logged.<br />
<br />
Andres- Hint bits?<br />
<br />
Heikki- Think we just throw those changes away now but we could possibly do better.<br />
<br />
Andres- Sounds like you need to do the work to propose a patch and then we can see if the complexity is really bad or not.<br />
<br />
Matthias- There is not zero external interest in this.<br />
<br />
Mark- Sounds interesting and we would be interested in seeing a patch for it.<br />
<br />
Heikki- One thing that bothers me about md.c and smgr.c- the way they treat the relation forks is kinda ugly. md.c needs to know about all the forks and there's an array for the segments and forks and segments on forks. Would be better if md.c didn't have to deal with that and instead dealt with one fork at a time. smgr.c could maybe deal with forks.<br />
<br />
Andres- That doesn't seem quite right. You would have to somehow group the data together. Looked at that problem at one point and maybe we could have completely different relfilenodes for forks instead and have one fork for main relation and a different relfilenode for the VM would make things simpler.<br />
<br />
Peter E- Set of forks are hard-coded in a lot of places.<br />
<br />
Heikki- Larger relfilenode we could do that<br />
<br />
Matthias- 56-bit relfilenode would allow us to have room for that<br />
<br />
Peter E- What about other kinds of forks that other kinds of storage might want to add, maybe as an extension or maybe not, not everything may need the free space map, etc.<br />
<br />
Heikki- Working on column-store before would like to have a separate relation fork or something like that for each column.<br />
<br />
Jeff- Might be harder to do that but at least having more than one ... there's a difference between allowing a few extra forks vs. allowing potentially hundreds.<br />
<br />
Andres- But what about init forks which are weird magic, may have to come up with something different for that. Init forks maybe a separate directory then you wouldn't have to iterate over everything and could work and would allow to have initial data.<br />
<br />
Peter E- forks are used by different parts of the system such as init forks being hooked into the crash recovery system<br />
<br />
Heikki- init fork change would be good to do independently.<br />
<br />
Andres- had a patch to allow you to associate a given relation with multiple relfilenodes in pg_class but didn't quite get it all done<br />
<br />
=== TAMs ===<br />
<br />
Mark- As Andres mentioned, we don't want to accept things into core that are only for forks. Chicken and egg when it comes to TAMs though. More baked into core that we have heaps; lots of code doesn't talk about tables, it talks about heaps. Lots of assumptions. Mainly interested in hearing from the group how to address the chicken and egg issue when it comes to TAMs, or what the better way to go about it is. Company worked on zHeap for a while and was excited but not sure if it's ever coming back. Tendency to think of new TAMs as where devs go into the wilderness to pass away. Two TAMs developed and released to solve specific performance problems in production. Working on new TAMs with different on-disk formats. Anyone else working on TAMs is likely to want this too. How to share these improvements in core for these other TAMs? There are some functions that still call the heap functions directly and don't go through the TAM API. If we put a different TAM out while these differences exist and something accesses it directly through heap functions, then you end up with bugs. Created the Contrary AM to store everything differently on disk intentionally. If you use Contrary AM, you can find where functions which access the heap directly will explode and break. Happy to contribute that back, but not just that: could write a TAM which provides better performance but which isn't heap. Should we have a contrib module to make sure that the TAMs don't get broken?<br />
<br />
Heikki- TAMs should be how everything accesses tables; places that access the heap directly are bugs and should be fixed.<br />
<br />
Mark- Extensions exist out there or forks which access the heap directly and that's where the bugs are.<br />
<br />
Andres- For a beginscan could check that what was given is actually a heap and so there should be a catch happening but if we don't have that then we could add that.<br />
<br />
Jeff- Could we add preprocessor magic to check whether someone is calling the heap functions other than through a TAM, and maybe throw an error at compile time?<br />
<br />
Andres- Not sure we could do that and that might cause more harm than it would help.<br />
<br />
Heikki- What extensions are doing that?<br />
<br />
Peter E- pglogical is doing that.<br />
<br />
Heikki- Why does it do that?<br />
<br />
Peter E- Fixable but could build more scaffolding to avoid new extensions doing that.<br />
<br />
Alvaro- How much is there between heap and the Contrary AM?<br />
<br />
Mark- When you have a loadable module you expect in minor upgrade for those modules to continue to work, don't expect that the module is updated at the same time as the core code. If you copy the heap code and make it pile and add it as a contrib module as a TAM but identical to heap. If you upgrade from 15.0 to 15.1, it should be fine, but you don't know for sure and the community may change the heap code in a way that changes things that aren't compatible. Maybe we could have something in contrib which runs in the buildfarm to detect such changes.<br />
<br />
Andres- What minor version changes has this happened in? We've only had like three of these recently, just those?<br />
<br />
Mark- Yes, just those.<br />
<br />
Heikki- Is this just for contrarian AM or are there real cases where the heap changes in minor versions have caused issues?<br />
<br />
Peter E- May only happen once a year or once every three years but when it does happen it's very traumatizing...<br />
<br />
Mark- We have several customers with heap performance problems and they continue to ask for fixes. Index bloat due to holding open a transaction and lots of updates, lots of tuple versions across pages and lots of index bloat. Make a solution to that problem and give it to the customer, you're very nervous that this new AM won't survive minor version updates. Would like a better guarantee that this won't be an issue in minor version upgrades.<br />
<br />
Andres- Don't recall a case in core where we have had an issue with this and sounds like isn't a core issue really but is an issue in extension. Not obvious how we could test for this in core because it's an issue in something external. Maybe something that tests for signatures.<br />
<br />
Mark- Does anyone else that wants in TAM... want to spend maybe a year working on a new TAM to allow push-down and contribute to the community. Others interested in predicate push-down stuff?<br />
<br />
Jeff- Yes, interested in that.<br />
<br />
Andres- Heap could be interested in this too<br />
<br />
Heikki- A few things left in core that assume there is something like heap. ANALYZE assumes you have blocks. Would be nice if there was a function in the TAM API. Was expecting that to be brought up.<br />
<br />
Peter E- Not really going into fully different things with this- things are still blocks.<br />
<br />
Heikki- bitmap heap also assumes blocks still<br />
<br />
Mark- If you take a bit of space to say what kind of page it is then you could have your scan skip blocks that it isn't interested in. Don't need an API change for that really. Will TABLESAMPLE land on the wrong kind of block maybe, is that an issue?<br />
<br />
Heikki- That seems like it may be an issue, yes.<br />
<br />
Andres- Some pretty easy API changes for bitmap heap scans but after that you're going to have to have something that's block shaped.<br />
<br />
Heikki- With bitmap heap scan, the problem is that it degrades and becomes lossy and you have to scan all tuples on a particular block.<br />
<br />
Matthias- Becomes lossy quite quickly, could imagine TAM that has many many line pointers per page / TIDs, so bitmap heap scans likely to always be lossy which could be pretty annoying.<br />
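<br />
To illustrate the degradation described above, here is a toy bitmap in Python. It is a sketch under stated assumptions, not PostgreSQL's tidbitmap code: the class name, the entry budget and the "fullest block first" eviction rule are all invented for the example. It keeps exact (block, offset) entries until the budget is exceeded, then remembers only the block number, so every tuple on that block has to be visited and rechecked.<br />
<br />
```python
class TidBitmap:
    """Toy TID bitmap: exact (block, offset) entries until a budget is
    exceeded, then whole blocks are remembered without offsets (lossy)."""

    def __init__(self, max_exact_entries):
        self.max_exact_entries = max_exact_entries
        self.exact = {}      # block number -> set of offsets
        self.lossy = set()   # block numbers whose offsets were discarded

    def _exact_count(self):
        return sum(len(offsets) for offsets in self.exact.values())

    def add(self, block, offset):
        if block in self.lossy:
            return  # the whole block will be rechecked anyway
        self.exact.setdefault(block, set()).add(offset)
        # Over budget: give up offsets for the fullest block first
        # (an invented policy, chosen only to keep the example short).
        while self._exact_count() > self.max_exact_entries and self.exact:
            victim = max(self.exact, key=lambda b: len(self.exact[b]))
            del self.exact[victim]
            self.lossy.add(victim)

    def tuples_to_visit(self, tuples_per_block):
        """Exact blocks visit only matching offsets; lossy blocks visit
        every tuple on the block and recheck the condition."""
        return self._exact_count() + len(self.lossy) * tuples_per_block


bitmap = TidBitmap(max_exact_entries=4)
for offset in range(1, 4):
    bitmap.add(10, offset)   # three matches on block 10
bitmap.add(11, 1)            # still within budget
bitmap.add(11, 2)            # fifth entry: block 10 turns lossy
```

With many line pointers per page, <code>tuples_per_block</code> grows and each lossy block becomes correspondingly more expensive, which is the concern raised for TAMs with very dense pages.<br />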
<br />
Andres- Someone working on patch to use radix for vacuum and that could maybe be used for bitmap heap scans, not a small change.<br />
<br />
Matthias- Not sure if happy with this part of TAMs, but also a patch to do batch inserts for things like insert into select<br />
<br />
Mark- Would be fantastic. Some of the code about assuming the number of tuples per page doesn't really work because I don't want to store the header over and over again, end up sorting the data as it comes in to improve performance and improve compression.<br />
<br />
Andres- Is that a question of TAM or executor code?<br />
<br />
Matthias- Both, right now when we batch insert we only do in copy, only happens with full batch inserts. Patch proposed to make possible to do batch insert in TAM and allow to buffer for later insertion and could use that to reduce WAL size and improve performance of compression. Old patch that hasn't been updated in quite some time, about a year ago. New TAM for multi and single inserts is the patch.<br />
<br />
Jeff- A lot more infrastructure could be provided for conditional push-down around things like parameterization. Ideally a TAM would advertise the columns that would be interesting for parameterization and then the planner could generate and cost those paths. That could be a combinatorial explosion, so the planner would need to handle pruning that. Would be useful infrastructure. Can be done with custom plans now but think that a lot of TAMs would want this and therefore would be good common infrastructure for that.<br />
<br />
Heikki- Does that predicate push-down make sense...<br />
<br />
Jeff- Rather than do the predicate push down, let it return some other structure to the executor and let it handle it there? I think that's a good point but it seems like the horizon for actually making that fully generic is pretty far out there. As an extension author, just trying to get something running, you'd have to do that with a custom scan. If you want to do some simple predicate push-down you have to invent a whole bunch of things in a custom scan, and that's a pretty long path when you may have a simple structure which would let you rule out a bunch of rows that aren't interesting to scan very quickly. Instead, being able to say parameterize me and provide that path to the executor would be simpler/faster to have.<br />
<br />
Andres- Possible issue with taking out exclusive locks in part of this but we could probably work around that to make it better and possibly provide a substantial speedup.<br />
<br />
Jeff- If we rearrange some of this, we may be able to rework how index parameterization is done. Don't have a lot of details there but essentially if you have a TAM with predicate push-down, looks a lot like a nested loop index scan from the point of view of the planner.<br />
<br />
Heikki- Could we use the exact same scankey infra for TAM that we do for index scans..?<br />
<br />
Andres- Today it's different<br />
<br />
Heikki- but you'd want it to be the same<br />
<br />
Jeff- Don't have a way to do that today<br />
<br />
Heikki- Have a heapscan key today in the TAM but we don't really use it today.<br />
<br />
Mark- If you ask for all rows where ID=5, you go to the index to get it, you don't really ask that of the TAM. There's an opportunity there for improvement if we also pass that through the TAM to eliminate things that the TAM has to do.<br />
<br />
Jeff- Thinking of the TAM to do less, would have to write a lot of custom scan code but could be done with custom scan node and TAM today, all doable but it should be common infra.<br />
<br />
=== XLog Format ===<br />
<br />
Matthias- Complaint- it's large. We have 44 bytes to just change 1 page. 24 bytes are xlog header, remaining bytes used in determining the actual page and related overhead. Don't think we should have that much overhead for changing a page. There are changes we could make to reduce that overhead if we are willing to make those changes. A few discussed before on the lists. Looking at what changes we actually can make, for instance- transaction IDs are included but few cases where we actually need transaction ID. Indexes are not aware of transactions and don't use transactions, maybe we can eliminate transaction IDs from index updates. Length of record currently uses 4 bytes but records themselves may need less than a byte to store that, and should be able to reuse the rest. Potential problem there is that the decoding may become fairly expensive.<br />
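<br />
One way to picture the record-length point above is a variable-length integer. The Python sketch below uses a generic 7-bits-per-byte encoding; it is an assumption chosen for illustration and not the encoding of any posted patch or of the current WAL format. Short lengths take one byte instead of four, and the loop in <code>decode_length</code> shows where the extra decoding cost mentioned above comes from.<br />
<br />
```python
def encode_length(n):
    """Encode a non-negative integer, 7 bits per byte, low bits first.
    The high bit of each byte says whether another byte follows."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_length(buf):
    """Return (value, bytes_consumed) for a length written by encode_length."""
    value = 0
    shift = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated length")


small = encode_length(44)          # a 44-byte record needs one length byte
large = encode_length(8192 + 44)   # a record a little over 8 kB needs two
```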
<br />
Peter E- Send a patch!<br />
<br />
Matthias- Would have but have run into issues with decoding/unpacking of struct. When record is split across pages because it's too large or just didn't fit, right now we copy the whole record into a separate buffer that is allocated and then we checksum and then we decode it. For large records we have overhead of reading the record twice. Very expensive. Should be able to do decoding and checksumming in one pass, many opportunities to figure out that the record doesn't fit and the checksum at the end is good to validate the record. Don't need to checksum the full record before we start decoding.<br />
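<br />
The one-pass idea above relies on a CRC being computable incrementally. The Python sketch below shows that property with <code>zlib.crc32</code>, which accepts a running value as its second argument. Two assumptions to note: WAL records use CRC-32C rather than the plain CRC-32 that zlib provides, and the 8 kB fragment size and sample data are made up for the example.<br />
<br />
```python
import zlib


def crc_whole(record):
    """Two-step approach: the reassembled record is checksummed in full
    before anything is decoded."""
    return zlib.crc32(record)


def crc_streaming(fragments):
    """One-pass approach: feed each fragment to the running CRC as it is
    read, so the data never has to be walked a second time."""
    crc = 0
    for fragment in fragments:
        crc = zlib.crc32(fragment, crc)
        # A decoder could consume `fragment` here, treating its output as
        # provisional until the final CRC has been compared.
    return crc


record = bytes(range(256)) * 64                  # 16 kB of sample data
fragments = [record[i:i + 8192] for i in range(0, len(record), 8192)]
```

Both functions return the same value for the same bytes, so the checksum at the end can still validate a record that was decoded while it was being read.<br />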
<br />
Heikki- All comes down to performance, send a patch and benchmark it. More principled questions- Do we need to store the XID on every change for $reasons like debugging perhaps?<br />
<br />
Alvaro- Used by pg_rewind?<br />
<br />
Heikki- No<br />
<br />
Matthias- Not every xlog record gets generated in backends that have local transaction ID<br />
<br />
Andres- today we have generic handling of it but changing that means we need to make sure we copy transaction ID to everywhere that needs it<br />
<br />
Heikki- Would it be ok to compress the WAL itself? Do some kind of fields to the page header for the relfilenode so that it is only stored once instead of every time, or maybe reference previous WAL record, to avoid having to save the relfilenode over and over.<br />
<br />
Andres- Locking for that would be awful and adding intra-record stuff would be bad. Maybe reference the page header for multiple records within a page and could do that opportunistically maybe<br />
<br />
Matthias- We build the record before we know where the record is going to go<br />
<br />
Andres- already able to modify the checksum and we could change where it goes<br />
<br />
Heikki- This could change the size though, which is very different from calculating the CRC because it impacts the records after it. Comes down to the xlog insert path as that has to be highly concurrent; assuming we could make it work we could do it. Kind of wasteful that we store CRC, et al, on every record but then we flush the records in a larger buffer; maybe we should have a frame or page that contains multiple records.<br />
<br />
Peter E- but then you still have to go back, concerned about the size or..?<br />
<br />
Heikki- People concerned about size yes, though mostly about FPIs, but still<br />
<br />
Peter E- Decoding speed vs insert speed and size<br />
<br />
Andres- Record length variable width maybe... pg_waldump --stat was pretty reasonable on a big workload with variable record length, could optimize that further. If the data can be organized to make decoding cheap then could help but we have such low-hanging fruit like we store 4-byte integers where 3-bytes could be used, but just generally reducing alignment losses, et al. WAL was more than 60% zeros or something like this.<br />
<br />
Matthias- there are some alignment places where there are zeros..<br />
<br />
Heikki- We could possibly just compress the WAL when we write it.<br />
<br />
Matthias- Compress record or stream?<br />
<br />
Heikki- Compress the whole stream.<br />
<br />
Andres- Constantly end up flushing the last page multiple times which is pretty bad for OLTP workloads<br />
<br />
Heikki- Have to have an append-only compression algorithm<br />
<br />
Peter E- If you do that then you can only compress the flush size and that may not make sense<br />
<br />
Heikki- Not as good as compressing the file afterwards but still could be better<br />
<br />
Alvaro- Proposing using compression to avoid improving the base layer of how we write WAL<br />
<br />
Heikki- Can be an efficient way to address it<br />
<br />
Andres- One thing reminded me of for AIO, for network storage and slow storage, partially filled pages can't be overwritten concurrently and that becomes an issue. Have multiple IOs for different pages concurrently but can't do that for last page in WAL. Got performance improvements by using new pages constantly to parallelize it but that blows the WAL up really big so isn't good but does improve performance. Don't really see how to improve with the way partial pages happen. Reduce the block size to 4k instead of 8k as 8k doesn't give us any benefits and makes it much more common to have partial pages and on the filesystem typically have 512 or 4k granularity and so you just add overhead with 8k bytes and read/modify/write cycles.<br />
<br />
Heikki- Would be great to get rid of the WAL header entirely..<br />
<br />
Peter E- Only drawback is that you have more records split<br />
<br />
Matthias- Still need the segment header but that's ok<br />
<br />
Heikki- Think you're replying to me and just talking about the size change and yeah<br />
<br />
Andres- if you start to write WAL that's not page-aligned then performance suffers really badly.<br />
<br />
Heikki- Makes the decoding and encoding of WAL more complicated as you have to keep track of those headers<br />
<br />
Andres- You can't read as randomly from the WAL because you don't know where things start and end, but wouldn't know that without the header. Two-phase commit stuff ...<br />
<br />
Heikki- Instead of reducing the page size, just flush half a page rather than the full page and get the same benefit?<br />
<br />
Tomas- You will still modify the page multiple times and dirtied the page multiple times<br />
<br />
Heikki- Keep xlog size at 8k but at xlog flush then flush at 4k<br />
<br />
Peter E- Then you introduce a different idea of what a page is<br />
<br />
Andres- Don't see a real advantage to having 8k page size, doesn't seem advantageous.<br />
<br />
Heikki- You're saying it's useful to have 8k page size WAL?<br />
<br />
Andres- No, I don't see it being useful at all<br />
<br />
Heikki- As far as I'm concerned maybe we should have 32k or 16MB page size instead of having it be smaller to reduce page header overhead<br />
<br />
Andres- Able to determine end of WAL more easily<br />
<br />
Matthias- When we recover to a broken record, usually we expect it to be because of a split page write and that's where we know the WAL ends. Kind of important because as was mentioned in 56bit relfilenodes, we might write WAL all the way back to a prior flush<br />
<br />
Andres- Would be pretty bad to have to reflush a 1GB segment<br />
<br />
Matthias- Right now the wrong LSN will be seen in the page header<br />
<br />
Heikki- If it's not a recycled earlier segment but is a new segment...<br />
<br />
Stephen- Comments about larger field for AES GCM auth tag.<br />
<br />
=== Page Format ===<br />
<br />
Matthias- Right now we have normal page format which is used and available for various AMs. There are ideas about TDE which want to change the format to reserve some space for auth tag or other things like extended checksums. I think that space should not be at the exact end of the page while we are in memory, don't care about it when on disk, but in-memory that is prime real-estate.<br />
<br />
Discussion about page changes to store extended checksums or 64bit XID or auth tag.<br />
<br />
Tomas- 4k pages can greatly improve performance in OLTP workloads<br />
<br />
Andres- Case where low-cache hit ratio, shared buffers isn't enough to fit things<br />
<br />
Tomas- yes, in cases where shared buffers isn't large enough<br />
<br />
Andres- Might be more a factor of line pointers distance to the tuples<br />
<br />
Matthias- Shouldn't be an issue if huge pages are being used<br />
<br />
Andres- If huge pages used then yes but with prefetching there's heuristics that may not work if it's too far<br />
<br />
Tomas- When it fits into shared buffers it doesn't seem to make meaningful difference.<br />
<br />
Andres- That makes sense and is mainly write volume and not specific to SSDs really<br />
<br />
Tomas- SSDs have erase blocks but they're split into pages but the faster you write the faster you write into erase blocks, generate more work. Definitely it's a combination of multiple parameters.<br />
<br />
Andres- Dirty write-back with SSDs where they go back and write out blocks in a row to avoid having to make the system read in a page and then write it out can make a huge difference.<br />
<br />
Peter E- How can we get more people to use checksums?<br />
<br />
Andres- Right now there's a big performance hit from using checksums in some cases because WAL logging of hint bits causing a performance hit.<br />
<br />
Matthias- Idea to reduce hint bit writes in the WAL from changes with checksums enabled. There is no split page potential? Even if there is a torn page, you still redo the record.<br />
<br />
Heikki- Partial full-page?<br />
<br />
Matthias- Yes..<br />
<br />
Andres- Typically is going to be a full-page change anyway so doesn't really help<br />
<br />
Matthias- For freezing which is a common case, changing a lot of things on the page from visible to frozen, just changes bits, the meaningful bits aren't being changed<br />
<br />
Heikki- Point is when you write the WAL record, you'd have to say modify this bit or that bit<br />
<br />
Matthias- organization of the page doesn't change<br />
<br />
Matt- Who is we<br />
<br />
Matthias- the information on the page, while freezing we aren't changing the data of the page, whatever torn bytes there are going to be aren't changing the bytes that are meaningful. Reason for torn page protection is that the line pointer array may change and the tuples may be getting changed but if we don't change where the tuples are then a torn page shouldn't be an issue<br />
<br />
Andres- not sure that that's true, maybe for hint bits but not for freezing<br />
<br />
Matthias- even for freezing, it's only sets of bits that are being updated, fairly certain that we can improve on this. Think there are places where maybe we don't need to push out a full page image.<br />
<br />
Andres- Don't think it's enough, still need to do a full page write or need a second LSN<br />
<br />
Matthias- We do have space in the page header bits, only have a couple of bits used<br />
<br />
Andres- As soon as you have checksums this is all gone<br />
<br />
Heikki- This is for checksums too, you do a partial full-page write that's not as large, but seems hard to pull off<br />
<br />
Matthias- Yes, extremely difficult but not impossible.<br />
<br />
Heikki- What if we stop writing those hint bits?<br />
<br />
Andres- the performance hit is very large if you just don't write them at all<br />
<br />
Peter E- Vacuum-ish kind of process that does it sometimes still<br />
<br />
Andres- the SLRU lookups kill you immediately in terms of performance<br />
<br />
Jeff- Maybe a tiny cache that would help?<br />
<br />
Andres- Maybe could win a lot with a tiny hash table cache. Could cache xid to parent to win a lot to help with subtrans too.<br />
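<br />
A minimal sketch of the "tiny cache" idea above, in Python. The class, the slot count and the sample statuses are all hypothetical; this is not an existing PostgreSQL structure. It is a direct-mapped cache in front of a slow status lookup, which pays off when many tuples written by the same transaction are checked back to back. A real version would only be safe for final states (committed or aborted), since an in-progress transaction can still change status.<br />
<br />
```python
class XidStatusCache:
    """Small direct-mapped cache in front of a slow status lookup.
    Each slot holds at most one (xid, status) pair; collisions overwrite."""

    def __init__(self, slow_lookup, slots=16):
        self.slow_lookup = slow_lookup
        self.slots = [None] * slots
        self.hits = 0
        self.misses = 0

    def status(self, xid):
        index = xid % len(self.slots)
        entry = self.slots[index]
        if entry is not None and entry[0] == xid:
            self.hits += 1
            return entry[1]
        self.misses += 1
        status = self.slow_lookup(xid)
        self.slots[index] = (xid, status)
        return status


# Stand-in for the expensive lookup; the statuses are invented sample data.
COMMIT_LOG = {100: "committed", 101: "aborted", 116: "committed"}
cache = XidStatusCache(COMMIT_LOG.__getitem__)

# Many tuples written by the same transaction are checked back to back.
results = [cache.status(100) for _ in range(5)]
```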
<br />
Heikki- Have a better SLRU system and cache but isn't going to be as good as hint bits on the tuple<br />
<br />
Bruce- Have a scratch space for a table, always have an extra dead page in the table, instead of writing page 10, you write it into the dead space and have that space in the table. The reason we have trouble is because we can't go back to the old version of the table.<br />
<br />
Andres- Then have really hard problems with possibly returning double tuples<br />
<br />
Heikki- So whenever you write the page to disk you write to double-buffer area and it's an alternative which has downsides but is possible<br />
<br />
Andres- No need to log hint bit changes immediately and so maybe we could batch them and reduce the xlog overhead and WAL logged hint bits<br />
<br />
=== ResourceOwner Patch ===<br />
<br />
Heikki- Have patch to change the way they work internally and make them useful in extensions. Not using them currently in any extensions but for others to use if it's useful. Patch made some changes to how ResourceOwners work. Objections from Andres- with Heikki's patch resources are released in random order. May have made exception for locks.<br />
<br />
Jeff- Motivation?<br />
<br />
Heikki- pgcrypto wants to track some things in ResourceOwners, when wrote that code was very painful because couldn't use them. There's callbacks but they're really difficult to use. Hard for an extension to leverage ResourceOwners from extension. In core we have 10-15 uses for ResourceOwners and there's a lot of boiler-plate code that could be eliminated.<br />
<br />
Andres- Performance regression due to ResourceOwner getting bigger which isn't good<br />
<br />
Heikki- Hard part of patch was to keep performance good because of ResourceOwners being in critical path. Objection- in the current code resources are released in a specific order, but with the patch they get released in random order.<br />
<br />
Alvaro- Why do you care?<br />
<br />
Andres- There may be some dependencies in there and error handling needs to mark the page and has to happen before un-pinning the page during io.<br />
<br />
Alvaro- Can we create more phases<br />
<br />
Heikki- We were discussing that and maybe having a priority number or such. Not convinced that should be necessary. Would like to take a look at where that's being done. Second objection, if you need to remember a resource in a critical function you have to first call an enlarge function to reserve a slot for it. That mechanism is per resource-kind currently. Changed it so that there is just ResourceOwner enlarge instead of being per resource-kind. Difference is that if for some reason you want to reserve one buffer pin and one tuple descriptor and then you enter a critical section, that doesn't work because with the patch you can only allocate one slot. Argument is that the distance between reserving the slot and using the slot should be very small, because it's already very dangerous to do this: other calls in between might end up using the slot by accident. New patch just reserves one slot instead of having one slot for each kind.<br />
<br />
Peter E- You could just allow reserving more than one<br />
<br />
Heikki- Yes, if you know how many you might need then you could do that. If there's any serious code between the reservation and using the slot, it's very hard to be sure.<br />
<br />
Andres- Right now we always increase by power of 2 and that's part of the reason it's hard to find off-by-one errors. Maybe change to have a counter/check to make sure that you aren't going past how many.<br />
<br />
Heikki- Maybe have a way to return what was reserved and then have the use of that pass in the value of which was reserved and throw an error if that's an issue.<br />
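<br />
The reserve-then-use check suggested above could look roughly like the following Python sketch. It is an illustration under assumptions, not the ResourceOwner API: <code>ResourceOwnerSketch</code>, <code>reserve</code>, <code>remember</code> and the token scheme are hypothetical names. Reserving returns one token per slot, and remembering a resource must present an unused token, so a slot consumed by accident shows up as an error instead of a silent off-by-one.<br />
<br />
```python
class ReservationError(Exception):
    """Raised when a resource is remembered without a valid reservation."""


class ResourceOwnerSketch:
    """Toy owner: slots must be reserved before entering code that cannot
    fail, and each remember() must present the reservation it consumes."""

    def __init__(self):
        self.resources = []
        self.outstanding = set()   # reservation tokens not yet used
        self.next_token = 0

    def reserve(self, count=1):
        """Reserve `count` slots up front and return one token per slot."""
        tokens = list(range(self.next_token, self.next_token + count))
        self.next_token += count
        self.outstanding.update(tokens)
        return tokens

    def remember(self, token, resource):
        """Record a resource, consuming exactly one reservation."""
        if token not in self.outstanding:
            raise ReservationError(
                f"slot {token} was not reserved or was already used")
        self.outstanding.remove(token)
        self.resources.append(resource)


owner = ResourceOwnerSketch()
pin_slot, tupdesc_slot = owner.reserve(2)   # one buffer pin, one tuple descriptor
owner.remember(pin_slot, "buffer pin")
owner.remember(tupdesc_slot, "tuple descriptor")
```

Reserving two slots in one call also covers the case raised earlier of needing a buffer pin and a tuple descriptor before entering a critical section.<br />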
<br />
Andres- There should be only a few places that need more than one.<br />
<br />
Peter E- Seems solvable.<br />
<br />
Heikki- Third, for some kinds of resources you could do it differently: instead of having array and hash, keep track of resources in a linked list. Some resources have structs, so you could use a linked list for those and that could be faster.<br />
<br />
Andres- That could also make it safer. Ran into this for AIO for WAL insert and every AIO for WAL insert had to be reserved and is in critical section and you can't allocate memory there. Not applicable to all kinds of resources but does work for some.<br />
<br />
Heikki- Maybe linked list approach could be used for basically all of these cases instead since very few cases where there aren't structs. Maybe everything could use structs?<br />
<br />
Andres- Convert most things to list then maybe could be better. Patch by Rowley to get rid of all special cases by using dlist(?).<br />
<br />
Heikki- For buffer pins could we have a local buffer struct<br />
<br />
Andres- Have to allocate it. Could do something like existing resource for hints and just use dlists or lists for everything else.<br />
<br />
Heikki- Yeah, maybe I'll try that approach.<br />
<br />
Andres- Maybe combine dlist approach with patch approach by storing allocation in dlist head so that can store header, dlist head, inside resourceowner unless it's needed and then one or two things that are needed, might be best of both worlds.<br />
<br />
Heikki- I'll play around with that if I get a chance.<br />
<br />
=== ICU / Collations ===<br />
<br />
Jeff- Issues with collations. Right city to discuss it in. One thing is PG is pretty unified in the direction users are guided in, in terms of the way things should be done. Integer timestamps are better than float, et al. Should we be doing the same thing with ICU vs libc? Should we make a decision there? Are we not going to express an opinion? Of course, even if we try to not make a decision, if we leave the default as-is or change the default, that's a decision. Would we eventually like to pick one way and go with it, or stay on the fence?<br />
<br />
Peter E- Would like to move towards making ICU be used as the default.<br />
<br />
Andres- Hard dependency?<br />
<br />
Peter E- That's the problem. You can change the initdb default but it'll fail if ICU isn't compiled in. What do you do then?<br />
<br />
Tomas- What about use ICU by default if it's built-in?<br />
<br />
Peter E- Is that a good answer? Already is environment-dependent and so maybe it wouldn't be that different. Might be better as it would be a better default instead of getting it from environment.<br />
<br />
Heikki- Locale itself still depends on where it's running<br />
<br />
Peter E- Want to get rid of locales but that's kind of independent.<br />
<br />
Andres- Just make it a hard dependency?<br />
<br />
Alvaro- Are there other collation providers? Microsoft?<br />
<br />
Andres- Microsoft has ICU available but not the default for things<br />
<br />
Peter E- Collation provider concept, back of mind- there's a native API on MacOS which could have been another choice but there's no practical benefit. Doesn't seem like there's actually a bunch of different APIs, just the legacy one and the ICU one and not really interest in other.<br />
<br />
Dave- What about platforms which don't have ICU? Are there such?<br />
<br />
Andres- Don't think there really are any such. Built PG on a bunch of platforms last year and pretty much all have ICU for a long time. May not be available by default on some systems but it's available. No extra dependencies on MacOS currently to build and some appreciate that.<br />
<br />
Peter E- When on new platform sometimes it's nice to be able to git clone and build and avoid ICU because ICU is big to download and build.<br />
<br />
Heikki- Agreed that isn't great, maybe have an option to not have any collations in that case?<br />
<br />
Peter E- Question is if we want to nudge users to use that stuff.<br />
<br />
Andres- Switch the default to use if available which would at least allow devs to not have to worry about it but generally it gets used.<br />
<br />
Alvaro- Is this something we could do for 16?<br />
<br />
Peter E- That's what we are talking about right now, as it's an easy thing to just change the default.<br />
<br />
Jeff- If we feel ICU is the right thing, we've been using it for a while and we have found some issues with it and it isn't perfect but generally my feeling is that it's a better path than libc and if the project feels that way then we should start nudging people in that direction.<br />
<br />
Joe- Not all of PG locale functionality is handled by ICU. lowercase/uppercase, C-type operations...<br />
<br />
Jeff- lowercase/uppercase do use ICU but there are some scattered cases of other things being used. The strxfrm call when making histograms is maybe an example.<br />
<br />
Peter E- Question about how to instrument these things to catch such cases would be good to figure out. tsearch uses it, on list to fix but isn't very interesting.<br />
<br />
Jeff- Those scattered places ... there are details we should work to figure out and address those cases<br />
<br />
Heikki- Even if we don't change the default, we should fix those cases anyway. Is there a reason to not use ICU?<br />
<br />
Matthias- because you're only using the C locale?<br />
<br />
Peter E- Then just say to use that?<br />
<br />
Matthias- But you could build a smaller binary with having just the C lib<br />
<br />
Heikki- Is there a performance reason to use ICU rather than C lib?<br />
<br />
Jeff- In my tests it's been better<br />
<br />
Joe- There was a regression introduced by the C maintainers where they made a change saying it wouldn't cause a performance issue but it actually does for multi-byte. ICU is faster if you have a lot of UTF8 multi-byte characters vs. libc. Big regression in recent versions of glibc.<br />
<br />
Jeff- If they fix that problem then in theory libc could be faster, but if they don't fix that then ICU blows away glibc.<br />
<br />
Joe- Scattered calls to glibc locale dependent functions in PG core that aren't going to ICU, concern about switching to ICU due to that<br />
<br />
Jeff- In terms of actually what the user sees, should all be handled correctly. The cases pointed out shouldn't be user-facing. A lot of those cases are with libc collation provider and not with ICU, though there were some calls that may need to be looked at. If there are user-facing issues then that's a bug that should be addressed. Assuming we can address the bugs...<br />
<br />
Matt- Independent of performance, a libc upgrade that changes sort order breaks indexes, etc. ICU would make it easier to detect/address that?<br />
<br />
Jeff- No way to change from one collation to another today and so have to keep same ICU version. But, there are potential advantages to using ICU because it's a separated library that you could manage the versioning of instead of being tied to libc.<br />
<br />
Joe- Isn't just indexes. If you have FDWs and are running on machines with different versions of glibc then you'll have problems in that case too. Recent case of mysql FDW where they were having problems because a join wasn't working because the collation for mysql was different.<br />
<br />
Matt- Haven't seen a strong advantage to ICU vs glibc because that's the same problem between the two.<br />
<br />
Andres- Not really a nice way to load multiple versions of ICU, but you could do that more easily with ICU than with glibc.<br />
<br />
Peter E- Or you could just keep the same ICU version around generally instead of having to upgrade it, like you have to upgrade glibc due to $reasons. We could move things forward at least, not a panacea.<br />
<br />
Joe- Did a project where extracted out of glibc the locale code into a separate library to use and freeze the collation at a particular collation that way. Link PG to that library instead of the actual glibc library.<br />
<br />
Tomas- Wouldn't want to get stuck on one collation as there are improvements that happen. If there's a new glibc version, how difficult would it be to update that?<br />
<br />
Joe- Was able to test it extracting 2.17 and 2.26 and the way extracted was able to work for both. Could be extended to build a different version if needed.<br />
<br />
Tomas- How would that work? We would decide when building a major version?<br />
<br />
Joe- Think it would be something that the packagers would have to handle. Same issue with ICU.<br />
<br />
Jeff- Have working code to allow change ICU library at runtime so users could change to a new version of ICU. Could help users prevent issues with the library changing out from under them. Based on prototypes that Tomas provided earlier. Might not go into 16 but the code works. Also have prototype code which allows doing something similar for libc. Packagers could build against later version and then users would be able to choose version at initdb time. Packagers would then package up multiple versions and make them available concurrently and keep them all forever and users could then choose the one for them and keep it static.<br />
<br />
Tomas- One of the problems we have is people upgrading OS where new server has new glibc and they don't realize their indexes have gotten broken. This would be a solution to that by installing the old compat library.<br />
<br />
Joe- Have to do that before they do anything.<br />
<br />
Tomas- Is there a way to track the version and on server start check the version and refuse to start if needed.<br />
<br />
Joe- PG15 may emit a warning?<br />
<br />
Jeff- That's a different thing: collversion, which is different from the collation library version. Simplest proposal could be to pass a flag at initdb time saying which library version is needed, then track that and use it for initialization and for setting up the collation from that provider.<br />
<br />
... further discussion over lunch including about loading multiple ICU versions concurrently, tracking collation version in the catalog, allowing to build new indexes concurrently with existing, et al<br />
<br />
=== Improving Wait Events ===<br />
<br />
Bertrand- Add more details to wait events; for example, a buffer content wait could include the relfilenode and other information. The details won't be the same for each wait event, so both the data we want to add and the number of details differ: for buffer content we might have 3 additional details, for a checkpoint we might have 2 extra items. Store this in the session dynamically and then be able to return the data from pg_stat_activity or otherwise. There is an issue with consistency: currently we have wait event and wait event type in an int32 and that is always consistent, but if additional information is included then not sure how to keep it consistent.<br />
<br />
Andres- Issue with overhead too. Wait events were added because they're cheap, but adding this other info adds a lot of additional overhead in certain code paths.<br />
<br />
Bertrand- Yes, have to consider that.<br />
<br />
Andres- Still have to store all the source data with each different wait event. Not sure how to do that without making it much more expensive.<br />
<br />
Bertrand- Have to see how to measure it, maybe able to make it lossy to address that cost.<br />
<br />
Peter E- How would you expose it beyond just pg_stat_activity. The more detail you add such as how extensions may want to include means you have to consider how to display it. How are people supposed to use it?<br />
<br />
Andres- Depends on the details. Contention on the content lock of a btree page is very different from contention on the content lock of a heap page.<br />
<br />
Alvaro- Is waitevent the best way to store this information. Maybe could write to ring buffer and could from application side read that and build a history of what's been going on instead of having to poll that information.<br />
<br />
Peter E- Could still be the wait event API and just store it in a different place<br />
<br />
Andres- We use wait events in critical sections where you can't allocate things and can't do anything serious because heavy locks are being held. Making wait events much more expensive isn't going to be acceptable.<br />
<br />
Peter E- Is consistency really required in all these cases, maybe we don't need to have it be completely consistent? Maybe just write 4 int32 fields without locking around them all at once and maybe it's fine to do that independently.<br />
<br />
Andres- Have to be careful to not make it vastly more expensive. When you read you need to know if what you read is actually valid or not. You can't read it without knowing if it's actually reasonable or not.<br />
<br />
Bertrand- Over 10s you have a bunch of different wait events; if some aren't valid maybe that's ok, but you need to know whether each one is valid.<br />
<br />
Peter E- Would be really frustrating if you get the wrong data and take action based on incorrect data.<br />
<br />
Bertrand- Is it worth it to spend time on this?<br />
<br />
Peter E- Could do some simple tests by making value 8 bytes (maybe larger) and putting a spinlock around it and see how expensive it is.<br />
<br />
Andres- 8 bytes probably fine<br />
<br />
Stephen- 8 bytes not enough though to track this<br />
<br />
Andres- 8 bytes atomic on nearly all platforms these days, isn't on like armv7 but probably not a big deal there<br />
<br />
Peter E- Is 8 bytes enough?<br />
<br />
Bertrand- Depends on the wait event<br />
<br />
Peter E- Maybe say "here are the things thinking about adding, this this and this" and if it fits in 8 bytes then maybe ok but if it doesn't then may need another idea.<br />
<br />
Andres- Suspect overhead is going to be too high and will need a completely different mechanism because collecting all that data for all wait events and most of the time it's not going to be needed and is just expensive.<br />
<br />
Stephen- Maybe pull together other information at the same time when polling rather than putting it all in the wait event<br />
<br />
Andres- yeah, maybe store buffer ID then good chance you'll be able to figure out what it is without having to store everything in to the wait event<br />
<br />
Bertrand- Sticking to 8 bytes only might be ok<br />
<br />
Andres- store the buffer ID and then get the rest from shared buffers and doesn't introduce a lot of overhead by default. Some of this is maybe solved by dtrace instead possibly.<br />
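<br />
A minimal sketch of the 8-byte idea above: put the existing 32-bit wait event value and a 32-bit buffer ID into one 64-bit word, so that a C implementation could publish both with a single atomic store and look up everything else from shared buffers at read time. The field layout and the function names are assumptions made for illustration, and Python only demonstrates the bit packing, not the atomicity.<br />

```python
MASK32 = 0xFFFFFFFF


def pack_wait_info(wait_event_info, buffer_id):
    # High 32 bits: wait event class and event; low 32 bits: buffer ID.
    # This layout is an assumption made for the example.
    for value in (wait_event_info, buffer_id):
        if not 0 <= value <= MASK32:
            raise ValueError("field does not fit in 32 bits")
    return (wait_event_info << 32) | buffer_id


def unpack_wait_info(packed):
    # Reverse of pack_wait_info: returns (wait_event_info, buffer_id).
    return (packed >> 32) & MASK32, packed & MASK32


packed = pack_wait_info(0x01000005, 42)
```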
<br />
Bertrand- Most of the time you have to guess: you have a wait event but you don't know what is happening. Would like to know if it's always the same relation and see what's going on in aggregate. If the database is waiting on something but you don't know what it's waiting on then that's not as helpful.<br />
<br />
Andres- Maybe infer from other information and not try to have everything answered by data provided through wait event<br />
<br />
Alvaro- Would be good to see what the system is doing in more granular way but wait event isn't the best way to get at that and maybe there should be another way that's a completely different approach which could be turned on/off for a specific operation that doesn't cause too much overhead. idk if things like a branch testing a flag will be too expensive of a problem. Storage of the performance data storage is going to have to be something completely separate from wait event. idea of using a ring buffer to store that data instead.<br />
<br />
Andres- In that case you have to be even more careful of storing the data because the cache line where that ring buffer data is stored and you constantly are writing then it's a significant amount of system memory being used.<br />
<br />
Peter E- Probes need to be put at a different level<br />
<br />
Andres- If you want that granularity, that's where wait events are; that's why wait events are there.<br />
<br />
Alvaro- Maybe where dtrace points are<br />
<br />
Mark- Could perhaps clear things out every so often<br />
<br />
Andres- If data-dependent branches are added to this to collect numbers, the overhead is going to be way higher. With the ring buffer idea, it would have to be a very small ring buffer to avoid too much overhead, and you would have to poll very, very often, and that all ends up with a lot of overhead.<br />
<br />
Mark- Don't have a specific design but if you overwrite regularly a particular place<br />
<br />
Andres- but then you have a memory barrier and you get a stall if you don't have that data in L1. As soon as you do any reads it gets much more expensive and you need to do reads in the data collection path.<br />
<br />
=== Extensions & Stats ===<br />
<br />
Bertrand- Folks are working on this and want to make sure the idea is generally supported. Idea is to allow extensions to add stats into the system.<br />
<br />
Andres- Add infra to add stats at runtime. pg_stat_statements has its own storage but if we added the last bit of extensibility to the shared memory stats system then maybe wouldn't be needed.<br />
<br />
Bertrand- If you want to reset stats then maybe have it in a different file..<br />
<br />
Andres- Why?<br />
<br />
Peter E- Storage happens just at shutdown<br />
<br />
Andres- When you store it on disk need to keep track of what extension added the stats. With different files maybe you have an easier time detecting which file goes with which extensions stats.<br />
<br />
Peter E- Same issue as always, extensions have to register themselves somehow<br />
<br />
Andres- Maybe just extension name would be fine, just write to disk on shutdown, maybe a bit more space but probably not an issue really. One thing I'd like to change about pg_statistic would be to make it crash safe, because on crash we run into problems where vacuum doesn't do anything initially because the stats were lost. Maybe store the stats with the redo LSN on checkpoint, as of that point in time; they might be slightly dated but would otherwise be correct.<br />
<br />
Heikki- Is there a case where having old stats is worse than new stats..?<br />
<br />
Matthias- What about truncation or such?<br />
<br />
Andres- That should create a new relfilenode and that should be ok<br />
<br />
Heikki- what if stats say it doesn't need vacuum but changes since last time make it so that it does need to<br />
<br />
Andres- Today that problem already exists, this would make things at least better a bit. Could change most of the stats to include relfilenode and then use that when doing replay maybe and should solve truncate problem too. Not sure if there is semantic issue with that.<br />
<br />
Vik- If autovacuum sees 4 zeros then maybe it should select that table for analyze<br />
<br />
Andres- on stat reset then autovacuum goes crazy and that could cause an issue.<br />
<br />
Vik- Also issue with failover where we don't have stats<br />
<br />
Andres- If we add relfilenode to stats key then that would help with failover too and you could count the number of inserts and updates and such and keep that and that would be better than zero. May also be able to serialize stats into xlog during checkpoint maybe and that works maybe for inserts and updates but not for selects because those are actually different on the replicas vs. the primary.<br />
<br />
Peter E- Maybe the standby only replays the things that it trusts or which it should, but that could be messy.<br />
<br />
Alvaro- What about WAL size increase<br />
<br />
Andres- We used to write out the stats a whole bunch and it wasn't an issue really. Have seen cases with really huge stats but those were very exceptional. We could do something like a summary at commit time maybe, and that might make reconciliation in memory easier.<br />
<br />
Tomas- Only write out the things that changed since the last time... <br />
<br />
Andres- Could add a change counter or such to the in-memory so that we could know when we need to pass things along<br />
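<br />
The change-counter idea above could look roughly like this: each in-memory entry counts its modifications, and a flush passes along only entries whose counter moved since the previous flush. This is a toy sketch, not the shared memory stats system; <code>ChangeTrackedStats</code> and its methods are hypothetical names.<br />

```python
class ChangeTrackedStats:
    """Toy in-memory stats table with a per-entry change counter."""

    def __init__(self):
        self.counters = {}  # key -> {counter name: value}
        self.changes = {}   # key -> number of modifications so far
        self.flushed = {}   # key -> change count seen at the last flush

    def bump(self, key, name, amount=1):
        # Update one counter and record that the entry changed.
        entry = self.counters.setdefault(key, {})
        entry[name] = entry.get(name, 0) + amount
        self.changes[key] = self.changes.get(key, 0) + 1

    def flush_changed(self):
        # Return only the entries modified since the previous flush.
        out = {}
        for key, count in self.changes.items():
            if self.flushed.get(key) != count:
                out[key] = dict(self.counters[key])
                self.flushed[key] = count
        return out


stats = ChangeTrackedStats()
stats.bump("rel_a", "inserts", 10)
stats.bump("rel_b", "updates", 3)
first = stats.flush_changed()   # both entries are new, so both are flushed
stats.bump("rel_a", "inserts", 5)
second = stats.flush_changed()  # only rel_a changed since the first flush
```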
<br />
Heikki- Maybe add columns to pg_class to track<br />
<br />
Andres- Update so frequently that it could be a problem. Have to have some kind of per-database background worker of some kind perhaps.<br />
<br />
Tomas- When we store last vacuumed / last analyzed, maybe store at that moment..<br />
<br />
Andres- but at that point it's probably not useful, wouldn't end up triggering another vacuum because it was just run. Logging to WAL at checkpoint time is easy but if you are doing catalog updates then you have to connect to each database to update those catalogs if it's in pg_class, etc.<br />
<br />
Tomas- If we had columns in pg_class, what use-case would that solve that regular logging of stats wouldn't? If we log in WAL stats, would that give you everything you need. Benefit of pg_class would be that you wouldn't lose stats because they're WAL'd.<br />
<br />
Heikki- You could do it differently<br />
<br />
Tomas- Losing stats is a pretty common issue. Two different approaches to the problem- WAL log stats directly vs. updating pg_class. Seems like storing stats in WAL would be better. Once in a while flush modified stats to WAL.<br />
<br />
Andres- Just changing the key to the relfilenode likely would help but wouldn't deal with insert/abort but could handle that by logging more info during abort. With relfilenode we could track enough to get close enough value.<br />
<br />
Heikki- Talking 3 approaches. 1) never WAL log stats but instead calculate stats based on WAL data seen; issue: stats on primary vs. replica they could drift apart- basically dead reckoning, might get far off<br />
<br />
Tomas- How would this work? pg_stats has a lot of other stuff and so you wouldn't have background writer info or checkpointer<br />
<br />
Andres- Why do you want that from the primary when you're on the replica?<br />
<br />
Tomas- Just spit-balling, maybe there's other types of stats that do make sense<br />
<br />
Heikki- Number of stats that you'd want separate on the replica from the primary like seq scans, index scans, etc. Second proposal, instead of dead reckoning, you dump the whole stats file to the WAL on a regular basis and that could be very large which seems like an issue.<br />
<br />
Andres- The size doesn't seem that bad<br />
<br />
Stephen- Maybe store into new place on replica and pull into place on promotion<br />
<br />
Andres- Requires handling of the stats differently quite a bit possibly<br />
<br />
Heikki- 3) Put stats in pg_class directly, not everything but important things<br />
<br />
Andres- That seems like it would be very hard<br />
<br />
Peter E- Saying we only care about these stats for these specific reasons, but not other stats, which isn't great because people care about the other information. Maybe is ok but maybe not.<br />
<br />
Heikki- With any of these schemes, it seems like we would want to separate these<br />
<br />
Andres- Think we agree that trying to keep some stats after crash would be good<br />
<br />
Peter E- We can't hard-code too much stuff if we want to keep stats system extensible<br />
<br />
Andres- Extending stats comes with a bunch of things to be added but shouldn't be too hard to keep extensible even with these ideas.<br />
<br />
=== Container sets (arrays, row types, etc) ===<br />
<br />
Vik- Range type- We have a couple by default but otherwise you have to create your own range, other types get created each time. How can we have multi-type values without having to create new types to do it. What would it take to get multi-sets?<br />
<br />
Peter E- Create them on the fly<br />
<br />
Heikki- Do the same thing as row types, there's permanent row types and also dynamic row types<br />
<br />
Peter E- Want to create a multi-set field with integer and maybe create that type on the fly.<br />
<br />
Vik- Yes, but just in queries it would be nice to create multi-set without having to create a whole new type<br />
<br />
Andres- We don't necessarily need a different pg_type if we can put some encoded into typenum maybe<br />
<br />
Vik- What about nesting?<br />
<br />
Heikki- If you think of them as records, think it works<br />
<br />
Peter E- arrays of multi-sets?<br />
<br />
Andres- Is nesting necessary?<br />
<br />
Vik- Multi-sets useful, maybe not nesting but maybe, multi-set of arrays<br />
<br />
Andres- If you can get a lot of things without implementing a lot of crazy stuff then maybe it could be done.<br />
<br />
Peter E- Is it really a problem to create types?<br />
<br />
Andres- Could end up bloating pg_type a lot<br />
<br />
Peter E- Where is bloat coming from?<br />
<br />
Andres- row types in pg_class ends up adding up a lot<br />
<br />
Alvaro- Also need to consider pg_depend, pg_shdepend for owner<br />
<br />
Peter E- bloat from pg_type ends up coming from every table having entry, maybe create new base type.. Creating 5 is maybe not that bad since creating 2 already. Maybe we just say we don't support ranges on table row types.<br />
<br />
Heikki- Multi-set of record would make sense<br />
<br />
Vik- Yeah. Main issue is knowing about these things on the fly and not necessarily having to put something into pg_type<br />
<br />
Andres- Maybe copy approach from records, might not be too hard except for nesting case.<br />
<br />
=== v16 Patch Triage ===<br />
<br />
session variables, LET command -- Tomas- Will be talking to Pavel about this patch. Did a review of it, planning to commit it, biggest question is if it's really a useful feature or not. I think it is. Patch in pretty good shape. Joe- Like the feature but not sure why it has taken so long. Alvaro- Has gone through several rewrites. Jeff- Risk of running afoul of SQL standard? Tomas- Don't think there really is. Heikki- Are there concerns about what happens if it changes in the middle of a query or..? Tomas- Having session variables that's accessible instead of GUC. It's not transactional. Vik- Why not just use a table? Heikki- Seems like a temp table with only one row? Stephen- Issues with temp tables being constantly created/dropped can't use on standby, etc. Peter E- May look into the standard and see. Tomas- Maybe not good to get into details on the patch right now. Heikki- Looking at patch now.<br />
<br />
Remove self join on a unique column -- Tomas- patch seems correct but is hard to convince myself that people actually write joins like this. Stephen- Because of ORMs. Tomas- Doesn't add much overhead.<br />
<br />
Avoid hiding shared filesets in pg_ls_tmpdir (pg_ls_* functions for showing metadata ...) -- Alvaro- Need to do something here but not sure if this is the thing to do. Andres- It doesn't show directories, but parallel operations have directories and this wasn't updated for that, so the semantics are not entirely clear.<br />
<br />
Make message at end-of-recovery less scary -- Andres- Idea of patch is quite useful, needs a good bit of polish based on last review. Not sure if that's changed more recently. Vik- Not just wording? Andres- No, 300-line patch and some of that is tests but is more than just wording. Currently we hide errors for example in some places and you get a useless message at the end and to fix that there are structural changes needed. Heikki- seems pretty narrow, if WAL recovery ends due to invalid length but it could end for a variety of reasons depending on if WAL recycled or not. Andres- On primary shouldn't get that and almost always zero out the page. On the standby we should but we don't zero out the pages and that is causing bugs and we should start doing that. Wrote a patch for that but some details are really hard to get right there.<br />
<br />
More scalable multixacts buffers and locking -- Andres- Not sure if there is agreement that this is a good idea because people want to move SLRUs into shared buffers and then this idea wouldn't make sense. Matthias- When is that going to happen? It is a bandaid but could help. Andres- But it is a bandaid that could cause really weird performance impacts. Needs a lot of work to figure out the access patterns and such. If it was just a config without massive downsides then it would be ok but it isn't that.<br />
<br />
pg_dump - read data for some options from external file -- Peter E- Don't personally like it but if someone wants to commit it then it should be fine. Stephen- Dislike having a whole new file format but whatever.<br />
<br />
CREATE INDEX CONCURRENTLY on partitioned table -- Matthias- Just like normal CONCURRENTLY but on a partitioned table. Heikki- Great feature if we can have it. Are there concerns? Andres- Not sure how the code can be correct, but maybe missing something. Opens a memory context and then calls existing concurrent code and expects snapshots to work across that but that can't really work so don't see how it could be correct... <br />
<br />
Function to log backtrace of postgres processes -- Peter E- Not sure if that's useful? Andres- Wished for this many times. Disagreements on list with this currently. Peter E- Maybe patch tries to do too much? Heikki- Every background worker has to be modified. Peter E- Probably isn't great that it requires that and maybe that's part of the issue. Andres- Does that to make it safe to use in signal handlers.. but it can't do it safely.. Whole reason it does it as shared preload library but that's not guaranteed because of how ELF works. Heikki- Any way to do it safely? Peter E- If you want to call it from a signal handler because you're stuck somewhere.. Andres- Does it really need to be called from a signal handler? Andres- If we use latch wait in more places and use that approach instead of trying to do it from signal handler then it may work. Heikki- Tom commented that surely this is unsafe to do from a signal handler. Stephen- Seems like general feeling is that this should be RWF as needing to be redone to not be trying to do this in a signal handler.<br />
<br />
pg_stat_statements and "IN" conditions -- Tomas- About normalization of the strings, variable number of values in the IN list instead of generating each entry it would normalize into smaller number. Feature seems useful where big IN list completely swamps the system. Andres- Adds a GUC? Seems unnecessary. Tomas- Can imagine cases where different numbers generate different plans, can understand why a GUC. In that case we are not differentiating between different types of queries. Peter E- The code looks very straight-forward here. Tomas- Anyone think we shouldn't have the feature or maybe we don't need to even have the GUC? Vik- Feature seems useful and we should just always enable it. Peter E- Discussion of query jumbling or if we need a switch and this might be something where we may want to have control over. Tomas- We should make the same decision between this patch and query jumbling. If it's hidden behind a GUC or internal function that says jumble one way or another.. Peter E- May be a good release to try putting this into when we're breaking things already and see what happens, if you break it, break it big. Maybe we should just do it. Andres- Anyone know why this adds a new field to struct location len for merge? Tomas- wants to track the original location to the unjumbled. Stephen- Seems like folks are generally in favor of this, maybe even without having a GUC.<br />
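<br />
To illustrate the normalization being discussed, here is a string-level toy that collapses IN lists of any length into one form. It assumes constants have already been replaced by <code>$n</code> placeholders, as in pg_stat_statements output; the actual patch works on the parse tree during query jumbling, not with a regular expression, and <code>collapse_in_lists</code> is a hypothetical name.<br />

```python
import re

# Matches IN followed by a parenthesized list made up only of $n placeholders.
_IN_LIST = re.compile(r"\bIN\s*\(\s*\$\d+(?:\s*,\s*\$\d+)*\s*\)", re.IGNORECASE)


def collapse_in_lists(query_text):
    # Replace every placeholder-only IN list with one fixed token, so lists of
    # different lengths produce the same text.
    return _IN_LIST.sub("IN (...)", query_text)


short = collapse_in_lists("SELECT * FROM t WHERE id IN ($1)")
long = collapse_in_lists("SELECT * FROM t WHERE id IN ($1, $2, $3)")
```

Both inputs end up with the same text, which is the effect that keeps a large variety of IN list lengths from swamping the statistics.<br />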
<br />
Fix pg_rewind race condition just after promotion -- Heikki- Completely forgot about this and am looking back through it. Just haven't gotten around to actually committing. Heikki will commit (haha, but probably).<br />
<br />
Faster pglz compression -- Tomas- Looks ready, plan to commit it. Difficult to understand but a good improvement. Heikki- Not sure about why to bother but don't see a downside. Tomas- People still do use pglz a lot, so.<br />
<br />
Parallel Hash Full Join -- Alvaro- Munro says planning to get this in shortly ... in November. Heikki- We want it. Seems to include bug fixes that should be committed?<br />
<br />
On client login event trigger -- Heikki- What have the problems been with it? Andres- If you screw it up you can never log in again which was one of the issues. At some point was work on a GUC to disable to allow you to get in.. Not sure if added but without that means no way to log into the system. Alvaro- Very wanted feature. Heikki- What do people want to do with this? Matthias- Possibly useful to set variables on log in or to log into a table that a user logged in. Peter E- Maybe wait until after event trigger disable GUC so that can bypass this if there is a bug or issue with it.<br />
<br />
Consider parallel for LATERAL subqueries having LIMIT/OFFSET -- Tomas- I may be able to take a look, but it's difficult to reason about whether it's correct or not; would be good to get Tom's input, but will take a look. Alvaro- Tom said he didn't know how it could be safe. Tomas- I'll read it and maybe learn something and try to figure it out and see if it could be done.<br />
<br />
pg_stat_statements: Track statement entry timestamp -- Andres- A lot of complexity. Tomas- The idea of tracking when entry added makes sense because if you have two entries for two tables and one has very large numbers and other has very low numbers, does it mean if one is more active? Might just be because of which one is newer and not which is more active really. Makes sense. But then adds a lot of complexity by adding in a lot of ways to reset things. Concerned about some of the changes. Andres- A lot of overhead been added lately and not sure if it's good to add more. Tomas- To do reasonable analysis you need to keep the deltas anyway and so not sure that this is really helpful. Peter E- If you have a data set where you care about tracking then likely you'll have entries for years and therefore isn't really that useful. Matthias- Even latest isn't that hard to derive by checking deltas across time. Tomas- Only issue with keeping regular snapshot is that it doesn't work for min/max latency for the query because once you get a spike you'll never see the new min/max in the following period but even so not sure that it makes sense to keep entry timestamp.<br />
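<br />
The "keep the deltas" approach mentioned above can be sketched as follows, assuming a monitoring tool that periodically stores cumulative call counters keyed by query id. The snapshot layout and the function name <code>call_deltas</code> are hypothetical, not part of pg_stat_statements.<br />

```python
from typing import Dict

# Hypothetical snapshot shape: query id -> cumulative number of calls.
Snapshot = Dict[str, int]


def call_deltas(earlier: Snapshot, later: Snapshot) -> Dict[str, int]:
    """Return per-query activity between two snapshots of cumulative counters."""
    deltas = {}
    for queryid, calls in later.items():
        previous = earlier.get(queryid, 0)
        if calls >= previous:
            deltas[queryid] = calls - previous
        else:
            # Counter went backwards: the entry was reset or evicted and re-added,
            # so everything counted in the later snapshot is new activity.
            deltas[queryid] = calls
    return deltas


activity = call_deltas({"a": 10, "b": 5}, {"a": 15, "b": 2, "c": 7})
```

This works for additive counters only. As noted in the discussion, minimum and maximum latency cannot be recovered by subtraction, because a past spike stays in the cumulative value.<br />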
<br />
psql - refactor echo code -- Peter E- Added myself to review it and will do so.<br />
<br />
pg_stats and range statistics -- Tomas- Did review of this. What it does is that we don't currently track range statistics and only problem with that is how we read and print the histogram and if there is a way to do that in pure SQL or if we need special functions for it. I will continue working on it and reviewing it and hopefully will make progress.<br />
<br />
pgbench: using prepared BEGIN statement in a pipeline could cause an error -- Alvaro- problem is that we prepare the whole thing, but maybe we didn't want it to change how we are measuring latency and a different method was proposed but the author hasn't changed it accordingly. Alvaro will comment that it wasn't updated to new approach.<br />
<br />
Add system view tracking shared buffer actions -- Andres planning to commit it, issue with tablespace tests but should be able to resolve in next few days.<br />
<br />
Using each rel as both outer and inner for anti-joins -- Tomas- Will make us consider more planning options. Currently only consider one way and this could allow other ways to be considered. Andres- Turns a lot of nested loop antijoins into hash antijoins which seems good. Tomas- Seems pretty reasonable..<br />
<br />
Dynamic result sets from procedures -- Peter E- Patch held up for a long time to get the display of multiple result sets due to psql needing it. Working on adding more capabilities and tests to check the extended protocol, found some issues and that's in progress for being fixed. Not sure if this patch will land any time soon but needs more tests and is a useful feature. Functionality is part of the standard. Heikki- Changes the protocol? Peter E- Kind of, protocol kind of just works today but maybe need to be more explicit to make sure that everything works. Have a patch to make it work with JDBC that's pretty small.<br />
<br />
Add foreign-server health checks infrastructure -- <br />
<br />
Parallelize correlated subqueries that execute within each worker -- Tomas- On list comments that it's unsafe but the discussion was side-tracked about discussion about how parameters passed to parallel workers. Not sure ... Not just about parallel subqueries or parallelism in general but also about how parameters are passed in general. Not sure what the conclusion is. Andres- Not close to being committable due to commented out warnings and seems to be WIP.<br />
<br />
postgres_fdw: commit remote (sub)transactions in parallel during pre-commit -- Andres- looks partially committed? Heikki- Not sure why this needs to be configurable? Andres- Seems to maybe have a lot of duplicated cases that shouldn't be needed between commit/abort?<br />
<br />
Update relfrozenxmin when truncating temp tables -- Andres- Every version seems to get more complicated ... <br />
<br />
functions to compute size of schemas/AMs (and maybe \dn++ and \dA++) -- Matthias- Would like more verbose options into backslash commands. Peter E- Not sure why want this ++. Maybe have + for more details but don't want to compute size every time. Stephen- Maybe have cache or stats for size of things to make them less expensive to query. Andres- Lot of work to actually keep correct answer for size in shared memory. Andres- Don't really see point and maybe just reject it.<br />
<br />
disallow HEAP_XMAX_COMMITTED and HEAP_XMAX_IS_LOCKED_ONLY -- Andres- just explicitly forbids a combination of bits that shouldn't be allowed. Mark- Mainly to pick up on corruption but not sure that it's actually not allowed to happen and pg_upgrade makes things very difficult because couldn't be 100% sure that this is an error. Seems probably right but not 100% sure. Andres- What does this actually get us though? Would it really catch corruption? Mark- Unless there is a way to prove that this really won't happen then can't commit it. Heikki- Same as having asserts to check things. Tomas- In some other cases have realized that there was corruption due to invalid bits being set and so this could be useful. Don't know how long it has been broken though. Peter E- Maybe add this to amcheck but not add an assertion as that's only in development anyway but won't help with corruption detection. Mark- idea is to add assertion to the code that matches what amcheck checks so if you decide to use such a bit pattern then would check and make sure that it gets realized that both need to be changed. Mark- Not willing to try and guarantee that this can't happen. Heikki- Maybe put it into amcheck and use that to see if it does happen in the field. Tomas- It might scare people though for no reason if it turns out to not be an issue and people who hit it might not report it. Heikki- another thing is that it seems to possibly report problem on pg_upgrade'd clusters which were valid and therefore this shouldn't go in because of that. Maybe doesn't kill the whole patch but maybe does.<br />
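<br />
A minimal sketch of the kind of check under discussion, written in Python for illustration only. The constant names mirror the infomask flags in PostgreSQL's heap tuple header, and the bit values are given as an assumption to be verified against the source tree; the check is also simplified to the plain lock-only bit, whereas the real HEAP_XMAX_IS_LOCKED_ONLY macro additionally recognizes an older bit pattern, which is where the pg_upgrade concern comes from.<br />

```python
# Assumed bit values (verify against htup_details.h before relying on them).
HEAP_XMAX_LOCK_ONLY = 0x0080
HEAP_XMAX_COMMITTED = 0x0400


def has_forbidden_xmax_combination(infomask: int) -> bool:
    """True when xmax is flagged both as a committed updater and as lock-only."""
    lock_only = bool(infomask & HEAP_XMAX_LOCK_ONLY)
    committed = bool(infomask & HEAP_XMAX_COMMITTED)
    return lock_only and committed


suspicious = has_forbidden_xmax_combination(HEAP_XMAX_LOCK_ONLY | HEAP_XMAX_COMMITTED)
fine = has_forbidden_xmax_combination(HEAP_XMAX_COMMITTED)
```

Whether such a result should be reported as corruption is exactly the open question in the notes above, since nobody was willing to guarantee the combination can never occur on a valid cluster.<br />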
<br />
In-place persistence change of a relation (fast ALTER TABLE ... SET LOGGED with wal_level=minimal) -- Peter E- Seems to add a lot of code but seems not really worth it. Andres- Should have removed minimal WAL long ago. Tomas- Lots of bugs there and people don't seem interested in fixing them..? Andres- Some of them fixed but those fixes sometimes added other bugs. Tomas- Maybe make minimal WAL level improvements conditional on other things. Andres- Don't see real use-case for wal level minimal. Tomas- Wouldn't use it for important data ... Peter E- Even if we don't like wal level minimal, this is a legit point that we could optimize this, but the code is large and adds things to check what we changed, etc. Tomas- For patch author it makes sense because it can be helpful. Heikki- Patch also just changes how relation rewrite is done to just use FPIs instead of a bunch of heap inserts and that could be better. Maybe get rid of wal level minimal stuff but keep the other changes. Andres- Not sure if this is really safe to do this way though like in rollback. Tomas- Should be at least split into two patches if it actually works, a patch for the non-wal-level-minimal part and then a patch for the wal-level-minimal optimization.<br />
<br />
Speed up releasing of locks -- Matthias- Good idea. Andres- Needs a bit more work and possible small slowdowns. Removing a lot of weird code. Doable for 16 if time is put into it.<br />
<br />
Add log messages when replication slots become active and inactive -- Tomas- seems like a simple patch if we want them, which seems reasonable we can have it. Doable for 16 if we want them.<br />
<br />
Daitch-Mokotoff soundex -- Tomas- Seems like a simple patch and will take a look and probably will commit it.<br />
<br />
reduce impact of lengthy startup and checkpoint tasks -- Andres- Have serious doubts about it making things better for xid wraparound and other things. Good idea in theory but need much more pared down set of things as said on thread. Pretty large change and probably not for 16 unless a committer picks it up and spends a lot of time on it.<br />
<br />
Add Amcheck option for checking unique constraints in btree indexes -- Mark- Responded on a couple of things, author submitted new patches, waiting for Peter G to see if he wants it. Probably doable if Peter G has time to review it.<br />
<br />
pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory -- Probably can go in as a bugfix?<br />
<br />
Error "initial slot snapshot too large" in create replication slot -- Andres- still couldn't figure out how to do much better than the current state. Not sure if anything new has happened.<br />
<br />
AcquireExecutorLocks() and run-time pruning -- Tomas- Amit is working on it and getting feedback from Tom and so seems to be in progress.<br />
<br />
64-bit SLRU page numbers (independent part of 64-bit XIDs) -- Heikki- seems like a good idea but not sure about implementation. Peter E- Not filled with confidence about it. Heikki- Is this independently useful? Matthias- Good to have in before 64bit xid because it reduces the patch size. Andres- could be useful with a different AM. Matthias- Also for the whole SLRUs and so for MultiXact it might help those too. Peter E- Asked why this is helpful, may have been changed? Matthias- Preparation for 64bit xids. Peter E- Question on whole patchset seems like things get added and then reverted in the patch series and is a bit confusing. Heikki- Looking at patch, it doesn't change more things..? There's some complicated logic in dealing with wraparound and switching to 64bit numbers should help with that but this patch doesn't seem to take advantage of that. Andres- would have to make pg_upgrade quite a bit more complicated to make it work. Heikki- There is a pg_upgrade part of the patch. Andres- not likely to make it into 16.<br />
<br />
Pluggable toaster -- Andres- Don't see it going anywhere and idea of content aware toasting is very complicated and patch adds a whole bunch of infra. Vik- Like the idea but is very complicated. Heikki- like the idea of making the toaster better. Matthias- For certain data types, specialized compression would be really good. Tomas- Seems like this isn't the right place to be putting this infra. Andres- want to compress json to get rid of keys, want to do it for everything not just toasted data and seems like isn't the right place to do this. Tomas- Seems like the wrong level. Want dictionary and use that for the data type and then compress and then it can be toasted like usual. Matthias- Not really a good way to make this available. Andres- This doesn't really get you much farther. Heikki- Seems actually more reasonable than thought. Two types of toasting, the compression and slicing data into tuples and putting in toast table. Matthias- This does both and tries to work with the data type to make the output more performant to access. json tuple deconstructed into multiple tuples following the structure. There have been some really compelling performance improvements using this. Heikki- Maybe be able to split this by the two different pieces. Very unlikely for 16 due to lack of consensus.<br />
<br />
Add pg_stat_session -- Peter E- Seems reasonable to make it into 16 as long as agreement about usefulness. Needs someone to look at it. Maybe move some things from pg_stat_activity to here?<br />
<br />
Allow parallel plan for referential integrity checks -- Mark- Robert marked it as unsafe because he wasn't sure if it was safe. Author doesn't seem to have time. Needs someone to pick it up or it should be closed or punted to 17.<br />
<br />
warn if GUC set to an invalid shared library -- Seems to need some cleanup? Hopefully someone can look at it, not a lot of code and could probably make it if worked on.<br />
<br />
add guc: hugepages_active -- Seems reasonable.<br />
<br />
Time-delayed logical replication subscriber -- Peter E- seems to be getting worked on, could be done in time.<br />
<br />
Add non-blocking version of PQcancel -- Heikki- Seems like a good idea in principle. Peter E- being worked on and plausible for 16.<br />
<br />
Add LZ4 compression in pg_dump -- Tomas- seems almost ready and have been reviewing it, probably good enough, likely for 16.<br />
<br />
Move SLRU data into the regular buffer pool -- Andres- Probably not for 16 at this point. Heikki- Concern about performance. Matthias- Performance seems ok. Heikki- probably not going to make 16 just because it's quite large. Andres- Deletes more code than it adds at least.<br />
<br />
doc: PQexecParams binary handling example for REAL data type -- Peter E- being worked on, should be fine.<br />
<br />
Support logical replication of DDL commands -- Not likely to make it to 16 as it's quite large.<br />
<br />
Skip replicating the tables specified in except table option -- Alvaro- seems like it needs some work, not sure if it'll be ok for 16.<br />
<br />
Data is copied twice when specifying both child and parent table in publication -- Sounds like a bug?<br />
<br />
Perform streaming logical transactions by background workers -- Partially committed?<br />
<br />
Fix dsa_free() to re-bin segment -- bug fix?<br />
<br />
pg_rewind: warn when checkpoint hasn't happened after promotion -- Heikki- Looking, not a large patch, seems sane and probably could make it.<br />
<br />
generate_series in selected timezone, date_add in selected timezone -- no opinions<br />
<br />
New hooks in the connection path -- Bertrand will update to remove hook which seems contentious and hopefully the rest is ok to go in.<br />
<br />
Check consistency of GUC defaults between .sample.conf and pg_settings.boot_val -- Andres- Good idea but was a competing patch, not sure which way will go.<br />
<br />
nbtree performance improvements through specialization on key shape -- Matthias- Needs some cleanup. Andres- Seems like too large a patch to make it in. Matt- Ask Peter G to review it? Matthias- Not sure how to make it much better than how it is.<br />
<br />
Add sortsupport for range types and btree_gist -- Jeff- I can probably take a look and see. Not sure what state it's in.<br />
<br />
Reducing planning time when tables have many partitions -- Alvaro- Rowley has been working on it.<br />
<br />
CI and test improvements -- <br />
<br />
Transparent column encryption -- Peter E- feels like it's complete.. Want to try and get it in and make it acceptable <br />
<br />
Switching XLog source from archive to streaming when primary available -- Andres- pretty reasonable patch, haven't looked at details but having a config option for this seems reasonable and could probably go in for 16.<br />
<br />
An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication -- Andres- Not sure why this needs to be solved..? Tomas- Seems backwards from what the streaming sync replication does? We commit locally and then wait and so there shouldn't be more than one waiting. Andres- Busy loop which doesn't seem good? Not sure about this one.<br />
<br />
Minimal logical decoding on standbys -- Bertrand- A lot of activity with feedback from Robert and Andres. Andres- Good chance that at least some of it could make it into 16.<br />
<br />
Compression dictionaries for JSONB -- Alvaro- Related to toasting patch? Heikki- Why just do this for jsonb? Matthias- Specifically implemented for jsonb but should make it possible for others too? Don't think it will make 16 because don't have bandwidth and not a lot of others interested. Tomas- Think it actually does the infra. Main problem I had with the patch is I tried to measure the benefit to show improvement and had trouble seeing consistent improvements. Not sure if that was my problem but we need to decide if we want to do this or pluggable toaster or what. Andres- Uses typmod seems like a no-go for this? Tomas- data type specific / column type specific compression, we need context to identify the dictionary, patch is using typmod for that which doesn't seem good. Heikki- seems like would belong better in the toaster. Alvaro- maybe have a half-hour discussion with the devs around these things. Andres- Seems to require the dictionary be specified which doesn't seem good. Tomas- Just initial implementation, in future would be a process to handle doing that. Not likely for 16 just because these questions need to be figured out and discussed more. These are ok in POC but not going to be good enough to go in yet. Andres- Maybe dictionary go into pg_attribute or other context. Does not seem likely for 16.<br />
<br />
ALTER TABLE SET ACCESS METHOD on partitioned tables -- Seems small enough and useful enough that could make it for 16.<br />
<br />
Add SPLIT PARTITION/MERGE PARTITIONS commands -- Alvaro- Definitely want this but not sure we are going to be able to make it for 16.<br />
<br />
Fix assertion failure with barriers in parallel hash join -- Bug fix?<br />
<br />
Support load balancing in libpq -- Andres- Not a very large patch. Tomas- Reasonable and may be able to make it in for 16.<br />
<br />
Add JIT deform_counter -- <br />
<br />
Amcheck verification of GiST and GIN -- <br />
<br />
Use fadvise in wal replay -- Andres- reject it. Tomas- Whole assumption is readahead is disabled, but if readahead is enabled then this is always worse. Nothing to solve here really.<br />
<br />
Let libpq reject unexpected authentication requests -- Andres- doesn't address issue with peer, at least. Solve some problems maybe. Need to be clear in the documentation what it is actually doing. Could possibly make 16.<br />
<br />
Support % wildcard in extension upgrade scripts -- Andres- Think this was pretty much rejected?<br />
<br />
Fix recovery conflict SIGUSR1 handling -- bug fix<br />
<br />
pg_visibility's pg_check_visible() yields false positive when working in parallel with autovacuum -- bug fix<br />
<br />
Add 64-bit XIDs into PostgreSQL 16 -- Not gonna make it for 16.<br />
<br />
Eliminating SPI from RI triggers -- Alvaro- Seems not likely to happen due to people being too busy with other things. Tomas- Was updated though? Maybe.<br />
<br />
Add initdb option to initialize cluster with non-standard xid/mxid/mxoff. -- For testing 64bit patch but could be useful for other things. Mainly for testing.<br />
<br />
Testing autovacuum wraparound -- Andres- Not planning on working on it really because we lack infra to do it without problems.<br />
<br />
Improve dead tuple storage for lazy vacuum -- Andres- Making progress, not sure it'll be ready.<br />
<br />
USAGE privilege on PUBLICATION --<br />
<br />
explain analyze rows=%.0f --<br />
<br />
Fix alter subscription concurrency errors --<br />
<br />
ALTER TABLE and CLUSTER fail to use a BulkInsertState for toast tables --<br />
<br />
Cygwin cleanup --<br />
<br />
logical decoding and replication of sequences, take 2 --<br />
<br />
doc: mention CREATE+ATTACH PARTITION as an alternative to CREATE..PARTITION OF --<br />
<br />
Add index scan progress to pg_stat_progress_vacuum --<br />
<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2008_Developer_Meeting&diff=37570PgCon 2008 Developer Meeting2023-02-10T08:43:37Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers and senior figures from PostgreSQL-developer-sponsoring companies is being planned for Wednesday 21st May, 2008 near the University of Ottawa, prior to pgCon 2008. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
This is a PostgreSQL Community event, sponsored by EnterpriseDB.<br />
<br />
== Time & Location ==<br />
<br />
The meeting will start at<br />
10AM, and will finish at 5PM, or earlier if we run out of things to<br />
discuss! The location is:<br />
<br />
Arc the Hotel<br />
140 Slater Street<br />
Ottawa<br />
ON K1P 5H6<br />
<br />
(613) 238-2888<br />
<br />
[http://www.arcthehotel.com/ Arc the Hotel]<br />
<br />
[http://maps.google.ca/maps?f=q&hl=en&geocode=&q=Arc+Hotel,+Ottawa&ie=UTF8&ll=45.420474,-75.696992&spn=0.004842,0.010074&z=17&iwloc=A Google Maps]<br />
<br />
== Attendees ==<br />
<br />
The following people are currently expected to attend the meeting. If you have been invited and intend to be there, please add your name to the list:<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Neil Conway<br />
* Jeff Davis<br />
* Pavan Deolasee<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* David Fetter<br />
* Magnus Hagander<br />
* Alvaro Herrera<br />
* Tatsuo Ishii<br />
* Marko Kreen<br />
* Tom Lane<br />
* Heikki Linnakangas<br />
* Denis Lussier<br />
* Michael Meskes<br />
* Bruce Momjian<br />
* Dave Page<br />
* Simon Riggs<br />
* Jignesh K. Shah<br />
* Teodor Sigaev<br />
* Greg Stark<br />
* Andrew Sullivan<br />
* Koichi Suzuki<br />
* Itagaki Takahiro<br />
<br />
== Agenda items ==<br />
<br />
The following agenda items are being proposed. Please add items and comments below so we can develop a useful agenda.<br />
<br />
* Review of commit-fests<br />
* Patch management<br />
* Buildfarm<br />
* Performance regression monitoring<br />
* Partitioning roadmap<br />
* Using multiple CPUs per query<br />
* Vacuum roadmap, including but not limited to<br />
** use cases we've improved recently versus ones which remain problems<br />
** autovacuum<br />
** improving HeapTupleSatisfiesVacuum<br />
* General strategy for dealing with platform-specific performance tweaks<br />
* Build system - gmake vs. VC++ vs. cmake & friends<br />
* Steps toward SQL/MED<br />
* Configurable TOAST compression<br />
* How to deal with sponsored features (e.g. the materialized views proposal)<br />
<br />
== pgCon 2008 Developer Meeting Minutes ==<br />
<br />
10:15 to 17:00 May 21, 2008<br />
<br />
=== Introductions ===<br />
<br />
Andrew Sullivan introduces himself, thanks Dave Page & EnterpriseDB. In order to get through the agenda, if we get into a contentious discussion, then we'll identify that, take it to a list, and move on. Before we start, we'll introduce ourselves.<br />
<br />
* Jan Wieck, Core Team, Afilias, focussing on COPY, commit enhancements.<br />
* Takahiro Itagaki, NTT, developed spread checkpoint, working on stability & usability.<br />
* Koichi Suzuki, NTT Open Source, synchronous replication.<br />
* Alvaro Herrera, Command Prompt, improving VACUUM.<br />
* Tom Lane, Core Team, Red Hat, trying to clean up the planner for 8.4<br />
* Heikki Linnakangas, Dead Space Map<br />
* Magnus Hagander, Windows work, pgAdmin<br />
* Oleg Bartunov, U Moscow, Indexing, GIN & GiST<br />
* Teodor Sigaev, same as Oleg.<br />
* Robert Lor, Sun Microsystems, DTrace monitoring.<br />
* Pavan Deolasee, EnterpriseDB, HOT, working on performance stuff.<br />
* Marko Kreen, Skype, Skytools & replication.<br />
* Jeff Davis, working on external sorting.<br />
* Bruce Momjian, Core Team & EnterpriseDB<br />
* Peter Eisentraut, Core Team, Credativ, Community process<br />
* Jignesh Shah, Sun Microsystems, TPCE<br />
* David Fetter, SQL/MED<br />
* Greg Stark, EnterpriseDB, working on unfinished patches<br />
* Dave Page, Core Team & EnterpriseDB, pgAdmin & Windows<br />
* Josh Berkus, Sun Microsystems, minutes, postgresql.conf<br />
* Tatsuo Ishii, SRA OSS, working on recursive join<br />
* Andrew Dunstan, consultant, buildfarm author, working on LISTEN/NOTIFY<br />
* Simon Riggs, 2nd Quadrant, working on MERGE, HOT Standby<br />
* Michael Meskes, Credativ, ECPG, promoting PG, recursive join<br />
<br />
Josh is taking notes, but we could use a backup person.<br />
<br />
=== Review of Commit-Fests ===<br />
<br />
Bruce is very happy with the way the commit-fest is going. The first one didn't work very well. However, the 2nd one went fairly smoothly, easy to see what was going on, the wiki worked well. The commit fest is finished now -- Magnus just needs to finish one patch.<br />
<br />
Is overwriting other people's wiki changes a problem? People aren't sure. However, in general, the wiki is good enough. One problem was that people were continuing development work and distracting from commit-fest. We could put a footer in the e-mail.<br />
<br />
Open items for May commit fest. We need a forcing function for closing a fest ... should we have a two-week limit? Marko: no, we need to look at all patches. Currently blocking on not looking at the patches. Need to go after reviewers. Greg and Simon said that they need to be assigned stuff because they don't necessarily know what to take. Josh suggests a daily status e-mail, and that Tom / Bruce need to nag people. Andrew S. suggests that we need to make it clear that we can have reviewers who are not committers. Simon suggests that there's a difference between people who will read code and people who won't; the commit-fest should be the second level of review.<br />
<br />
Perhaps we should have a list of reviewers or obscure patches. In the IETF, they've started taking people's names -- we should have a call for volunteers. Bruce did quite a bit of nagging in the 1st commit fest, but it didn't help. The 2nd one Bruce didn't do much and it went better. Bruce didn't see a major problem ... thinks it will get better on its own. What prevents people from reviewing? We don't know. We used to have a lot of patch review with people jumping on it on their own. Patch volume has also gone way up. <br />
<br />
Daily status e-mail not popular -- shouldn't do it. What about the volunteer idea? Need a manager for each commit-fest. Josh will happily manage the July commit-fest.<br />
<br />
Bruce: one more issue for the commit fest. What do we do with the items which were rejected? Need to have a list of patches which are almost done. We also might decide not to say "rejected". Someone needs to make the process ok with the submitters. The rejection e-mail comments really matter -- don't say "this patch is crap." We want to divide into "returned for more work" and "rejected". What do we do with patches that "need more work", though? Do NOT put them automatically on the next list. <br />
<br />
Maybe we should have a "limbo page". Tracker? No tracker discussion. Josh suggests that we just contact the author, then appeal to patches. Bruce says the TODO list supplies better organization. Josh will try his strategy for July. Also, replace "Claimed By" with "Reviewers". But what if you want to "Claim" something? Put a comment.<br />
<br />
Should we have more than four commit fests? That's tentative. We won't know how it comes out until 8.4. <br />
<br />
=== Community Mailing List Management ===<br />
<br />
What about merging hackers and patches? Seems to be generally OK. Some concerns about attachments. Putting patches into the wiki seems problematic. General discussion about merging various mailing lists. SQL & Interfaces, General and Admin, Performance & other stuff. We should come up with a set of lists to be retired or consolidated. Should discuss this online later -- Peter E. will follow up.<br />
<br />
=== How to Deal with Sponsored Features ===<br />
<br />
A week ago, someone offered to pay a hacker to work on materialized views. PeterE talked to him, and apparently others contacted him in private. People think this is a non-problem. There's an understanding problem on the part of companies; we need a technical spec sponsorship first. 95% of people go away with a workaround.<br />
<br />
Maybe we should document the process of adding a new feature to PostgreSQL. David F volunteered to document this. <br />
<br />
Do we have any way to collectivize funds? We can use SPI. Simon doesn't want all sponsored development to be public, there's a lot of low-level stuff that he doesn't want to argue about. There's also some bureaucracy with non-profits. Tom thinks we can do it privately, Josh cites GiST, OpenOffice as counter-examples. Greg says there are two questions, how hackers should get involved, and second the mechanics.<br />
<br />
Andrew points out that the management of software development always gets underestimated. Maybe we should have a registry of PostgreSQL coders available for freelance hacking.<br />
<br />
Koichi points out the issue around developers guaranteeing that patches would go in. Also, some projects just turn out to be a bad idea. And we need to pay for technical specs. Also, sometimes individual developers need a company to be an intermediary. So you just need to pay a commission. Also, funneling it through SPI could be a political problem and there could be community fights. <br />
<br />
=== BuildFarm & Performance Regression Testing ===<br />
<br />
Nobody knows who put this on the agenda (it was Dave). Sun will contribute to the buildfarm. <br />
<br />
Tom would like an easier view into the history. Is it possible to get notifications on the buildfarm? Yes, there are mailing lists.<br />
<br />
Is the buildfarm the right platform for performance regression testing? Josh suggested that we do small tests on the BF machines, and we will have a community-owned benchmark rig in Portland. EnterpriseDB has DBT2 rigs. Sun has some internal stuff. <br />
<br />
Simon thinks the BF would test basic operation performance to see that we don't mess up low-level operations. Josh wants a list of operations we want to test. Simon wants dedicated equipment. We need a testing framework for performance regression. Some people want complex tests that run all day. But there are simpler tests we can run which are fast. Josh says that most BF machines are not dedicated, and Tom cites need for a simple test which developers can run.<br />
<br />
Some kinds of testing ... caching algorithms, database maintenance, etc. require running tests around the clock and won't work for this. We need one lightweight test which tests fast low-level things. Then we need some heavier tests that run really long on a few dedicated machines. We also need people who will test for performance on particular patches. Jignesh wants to run a timed series of tests to show different versions and how those are doing.<br />
<br />
Can we build stuff into the current regression test suite? Sure, but would need to be optional. Maybe pgBench would be a better basis. Size of test would depend on which machine you're running this on. <br />
<br />
We have three performance items here:<br />
* Request for big 36-hour tests for major work. <br />
* Request for medium-sized test of up to 20 hours a day for dedicated testing machines.<br />
* And then a small 1-hour low-level test for buildfarm members etc. <br />
<br />
Heikki will lead the small test thing. Big testing is already happening. We need to cover both whether we're moving forwards, and whether we're moving backwards. The buildfarm has made regression issues much more manageable, we need to do the same for performance. EDB does not have such a thing for EDBAS. We will need software for running tests and need people to analyze the results on a regular basis. Building software will slow down this.<br />
<br />
Sun has open sourced a test harness (Faban). We also have pgUnitTest. Do we have historical versions of pgBench? Probably doesn't matter, pgBench isn't very good. Maybe we need pgBench2. Maybe we need something else. We also need to have detailed output. The rest should go to a mailing list.<br />
<br />
=== Partitioning Roadmap ===<br />
<br />
We have a patch which is waiting on coming up with a roadmap. No clear consensus on what we need. Simon created the patch based on fixing some severe problems in the field. Stopped because of the amount of planner changes required, syntax changes won't fix it.<br />
<br />
What problems are we trying to solve?<br />
<br />
* Can't rewrite queries with partitioned tables, get really bad plans.<br />
* Query plan caching<br />
* Managing partitions -- "building out of spare parts"<br />
* Triggers for updates and inserts<br />
<br />
As a consolation, other databases (MySQL, Oracle) have had to rewrite partitioning several times. Jignesh talked about partitioning in the storage layer. Josh pointed out that people like being able to do DDL on individual partitions. Alvaro: should we still have partitions in pg_class? Folks think yes. Also, Tom thinks we need to improve this incrementally.<br />
<br />
Simon says that there are two different cases: completely automated and very detailed and complex. That we probably can't keep very complex and make it easier to manage and more efficient. Maybe we should not use inherited tables. Andrew thinks that people like the current flexibility because they're working around, not because they really like it. <br />
<br />
We should start new partitioning as a new feature. Some features of inheritance are not useful for partitioning (multiple inheritance, etc.). But we'll want to have partitions accessible individually for some purposes. Or do we? And maybe right now we're not getting the simple case users. More discussion about feature.<br />
<br />
Will we start implementing a new, separate partitioning feature not based on inheritance? Or subclassing inheritance? Are we willing to have only range partitioning, or do we want hash partitioning? Range partitioning seems to be enough.<br />
<br />
Will we allow partitioning on an expression index? Lists of columns? Multi-dimensional data types? Maybe we should use opclass. The method should be programmable so it supports spatial in the future. But date range partitioning is the biggest use case.<br />
<br />
How big of a problem are the DDL scripts etc.? Simon thinks not as big as the planner issues, because you can work around them. Two different groups of users with two different primary problems: DDL for web/simple apps; for real DW, the plans.<br />
<br />
What would the syntax look like? Bruce describes some complex syntax. Jignesh describes the DB2 syntax. Autocreation of partitions is problematic. Josh says people want "PARTITION ON (<expr>)". Tom points out that people want "partition this existing 6TB table". Also, the problem with <expr> is how do we match that to a query range. <br />
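<br />
For the simple case of range partitioning on a plain column (not an arbitrary expression), matching a query range to partitions can be sketched as a binary search over sorted boundaries. This is only an illustration of the question raised above, not a proposed design; the <code>bounds</code> layout and the function name are assumptions made up for the example.<br />

```python
from bisect import bisect_right
from datetime import date

# Hypothetical layout: partition i holds rows with
# bounds[i] <= key < bounds[i + 1] (four quarterly partitions here).
bounds = [date(2008, 1, 1), date(2008, 4, 1), date(2008, 7, 1),
          date(2008, 10, 1), date(2009, 1, 1)]


def partitions_for_range(bounds, low, high):
    """Return indexes of partitions that can hold keys in [low, high]."""
    # Partition containing 'low', clamped to the first partition.
    first = max(bisect_right(bounds, low) - 1, 0)
    # Partition containing 'high', clamped to the last partition.
    last = min(bisect_right(bounds, high) - 1, len(bounds) - 2)
    return list(range(first, last + 1))


# A query for 2008-03-15 .. 2008-07-01 touches the first three partitions.
print(partitions_for_range(bounds, date(2008, 3, 15), date(2008, 7, 1)))
```

With an arbitrary expression in place of a column, no such sorted boundary list is available, which is the matching problem the discussion points at.<br />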
<br />
More detailed discussion. Take to mailing list.<br />
<br />
=== Multi-CPU Queries ===<br />
<br />
Lots of people are interested in parallel query. Also parallel COPY and pg_dump.<br />
<br />
We really haven't excelled at having a single query max out resources. The next couple of years we'll need to focus on this because the proprietary databases do better than we do in this. Informix could do this. Jan outlined an idea from a few years ago, in which the executor breaks out nodes. <br />
<br />
For parallelizing pg_dump, Simon has been working on it. Their idea is snapshot cloning, where multiple sessions can share the same snapshot XID. We need to write to multiple files, and then have pg_restore do restore from multiple files. There are some issues with locking.<br />
<br />
Even multi-threading the executor will bring up a lot of the same issues as multi-threading the whole db. Libraries, dependencies, etc. Simon says we can use processes. But there's a lot of overhead for processes. Some queries are not parallelizable. <br />
<br />
Jignesh brought up asynch I/O. Maybe that's easier than parallel query. Greg says that Oracle parallel query is easier to use than parallel server. Is that the same topic? Jignesh thinks so. Points out that splitting up the executor has a penalty for OLTP. But are servers actually CPU-bound? Several people say yes. Simon expects 2-4 times scalability with parallel query; bigger than that is very hard.<br />
<br />
We'd also have to make it configurable. We really want both asynch I/O and parallel query. Asynch is more useful for OLTP. A lot of discussion on different approaches ensued.<br />
<br />
We can do parallel pg_dump and pg_restore for 8.4. And maybe some kind of index/scan readahead. Parallel query is longer term. Who's working on it?<br />
<br />
=== Platform-Specific Optimization ===<br />
<br />
People keep coming in with platform-specific optimizations for PostgreSQL. Generally these show huge improvements, but are specific to an OS-HW combination, and aren't very tunable. This means that we have to re-write these things.<br />
<br />
But doing some things completely independent can actually be '''more''' code. So are we at a stage where we need to do platform-specific optimizations? Sometimes we've already had to, like semaphores. Asynch I/O may require this. Posix_fadvise works differently from Linux AIO. Jan mentioned "write barriers".<br />
<br />
Other examples: in explain analyze, would be good to have DTrace output to find out how many I/Os each operation did. Why can't we just do that using the current generic approach?<br />
<br />
We maybe can't come up with a general strategy. Or policy. <br />
<br />
(some discussion missed)<br />
<br />
Direct I/O is another OS-specific issue. But not sure how to resolve. We need it for really large shared buffers, but it's completely different for each OS. This is a good example of such an issue ... we'd need two completely different checkpointing code paths.<br />
<br />
=== Vacuum Roadmap ===<br />
<br />
From years ago, vacuum is the most hated part of PostgreSQL. All of the little fixes (FSM, autovacuum, etc.) may be "putting lipstick on a pig". Maybe we need to handle stuff fundamentally differently. The alternative is to go to an undo system. The idea is to make maintenance happen offline. Free Space Map, Dead Space Map, Dirty Space Map. Currently vacuum has to visit every page. <br />
<br />
So can we do one massive rewrite, or should we continue fixing it piecemeal? Greg thinks piecemeal. Pavan suggested some other fixes. Jignesh mentions the lack of predictability in seq scans. In general, everyone thinks that fixing individual problems is the approach.<br />
<br />
More issues we need to address:<br />
<br />
* Scanning whole index regardless of how many changes<br />
* Hint bits in tuple headers<br />
* Dead Space Map should address very large tables with few dirty pages. But does not fix very large indexes. So we need to do a lookup. Discussion about how to lookup index pages ensued. A serious issue for GIN<br />
* Index rescanning also prevents Synch Scan from helping vacuum. As well as vacuum_delay.<br />
* Update to a large portion or the entire table. Smaller segmentation? Maybe not.<br />
* Constant load databases ... vacuum in memory? Problem is that rows may still be visible.<br />
* Hint Bit setting. Maybe the bgwriter can set the hint bits.<br />
* Long-running transactions: Alvaro thinks we can fix that by having those get a snapshot rather than just an XID. People really liked this.<br />
<br />
Greg says there are use cases we've covered and ones we haven't covered. He thinks chipping away at the use cases is the right approach. It's like chipping away at a block of stone.<br />
<br />
More ideas: reference counting. Having to rescan the whole index seems to be one of the biggest problems. When we UPDATE a whole table, maybe re-write that? Maybe only if they lock the table? Marko did concurrent CLUSTER which wasn't completed. Could we have segments which are different sizes? That would break CTIDs. But Tom thinks that's fixable.<br />
<br />
Simon says we need more detail on real-world use cases. Greg Stark will follow up on the use cases.<br />
<br />
=== Build System ===<br />
<br />
Do we want to use something different? cmake vs. VC++ etc.?<br />
<br />
The current VC++ port for Windows builds breaks all the time. <br />
<br />
The current makefiles contain a lot of copy and paste, which Peter wants to refactor. He expects us to be able to use cmake in the future if we want to. That would allow us to more easily produce OS-native make files. Jan thinks this sounds like the old imake, which is dead for good reasons. Peter disagrees. KDE has changed to use cmake and it's working for them. cmake use is growing.<br />
<br />
In general people were positive to the idea. There was a lot of wondering about specifics. Like, what does it depend on? It's in C++. Seems to be OK.<br />
<br />
Peter will get started on this soon, with Magnus' help.<br />
<br />
=== SQL/MED ===<br />
<br />
David Fetter: looking for a way to let Postgres talk directly to other DBMSes and data stores. Using middleware is very unsatisfactory. Neil Conway wrote a patch which surfaced qualifiers for RULES. It would be nice to have qualifiers in userland as a good-enough for doing this. Already fixed DBlink for this.<br />
<br />
The patch surfaces the WHERE clause as a string currently. Heikki doesn't think that this is a long-term solution. EnterpriseDB implemented this for Oracle. Doesn't do joins either. <br />
<br />
Another issue is estimating rows from the remote query. <br />
<br />
People would rather have a more robust implementation. But is anyone working on it? Marko says this would also help PL/proxy. Jan proposed something more sophisticated using cursors where the remote qualifiers would open a cursor. But we also want the more brute-force interface.<br />
<br />
This is a more general issue about stored-procedure based views.<br />
<br />
Are there any other use cases for this? Marko thinks there is one for local access, such as ones for dynamic query generation. Simon and Heikki talked about pushing down aggregates. Simon doesn't see any way around parsing a plain text qualifier to pass down conditions. Heikki says this needs to also be handled by a plugin which does interpretation. <br />
<br />
So, do we accept the Conway patch the way it is now? Seems ok to do the stop-gap. <br />
<br />
Jan thinks that showing the node tree will work better. But others don't agree with him -- it wouldn't work for PL/perlU. But Jan thinks it would work to give it a pointer to the parse tree and the range, we'd need to add an access function for the PL.<br />
<br />
=== TOAST ===<br />
<br />
Alvaro: Postgres uses an LZ algorithm, but some people want to use other compression algorithms. How does it compare? If it compares poorly, we may want to switch to LZO. However, LZO has GPL code and doesn't really fix the problem.<br />
<br />
The real use case is storing big text documents. Right now they're working around it with BYTEA, but that's not satisfactory. Maybe we just offer a bzip contrib module and they can use updatable views. But pluggable compression stuff would open up other new uses.<br />
<br />
Jan thinks that we need to have some handles to fine-tune TOAST per column. Or maybe even according to estimated compression gain. Some discussion about TOAST headers and different ways to do this ensued.<br />
<br />
Should just deal with this through the hackers list. We need to see some performance numbers. Jan told a story about originally implementing TOAST.<br />
<br />
Jan also suggested having a separate data type for each compression algorithm.<br />
<br />
Next step is to see some numbers.<br />
<br />
== End Remarks ==<br />
<br />
Bruce: Wow, there's 26 of us in this room. Most everybody in PostgreSQL development is here. I'm excited. A lot of people travelled a huge distance. Some people are volunteers and made the time to be at the conference. It's really great that we've been able to talk on a technical level for 7 hours and make a lot of progress. Many of you have been working on Postgres for 7+ years, mostly as a volunteer. As long as we can keep that kind of dedication up, our potential is unlimited. A lot of stuff about the project hasn't changed, the culture, what drives us. We're really going to change the world, or at least our portion of it. Being associated with the PostgreSQL project has really been a highlight of my life. We'll be associating with a lot of other users over the next couple of days, but this is the core. This is what drives us forward.<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2009_Developer_Meeting&diff=37569PgCon 2009 Developer Meeting2023-02-10T08:43:36Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers and senior figures from PostgreSQL-developer-sponsoring companies is being planned for Wednesday 20th May, 2009 near the University of Ottawa, prior to pgCon 2009. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
This is a PostgreSQL Community event, sponsored by EnterpriseDB.<br />
<br />
== Time & Location ==<br />
<br />
The meeting will start at 9:30AM, and will finish at 5PM, or earlier if we run out of things to discuss! Tea and coffee will be available from around 9AM, and food and drink will also be served during morning and afternoon breaks and at lunchtime.<br />
<br />
The meeting will be held in the [http://www.novotelottawa.com/meetingsbanquets/meetingsbanquets.shtm Red Experience Room] at the Novotel Hotel, which is located at:<br />
<br />
33 Nicholas Street<br />
Ottawa<br />
Ontario<br />
K1N 9M7<br />
Phone: (613) 230-3033<br />
<br />
You can use [http://maps.google.com/maps?q=Novotel+Ottawa&btnG=Search&sll=45.352088,-75.723440&sspn=0.028048,0.057850&t=null&hl=en&cid=45352088,-75723440,4970248458606736768&li=lmd Google Maps] for directions if required.<br />
<br />
== Invitees ==<br />
<br />
The following people have RSVPed to the meeting:<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Joe Conway<br />
* Selena Deckelmann<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* David Fetter<br />
* Dimitri Fontaine<br />
* Stephen Frost<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Jonah Harris<br />
* Zdenek Kotala<br />
* Marko Kreen<br />
* Tom Lane<br />
* Denis Lussier<br />
* Michael Meskes<br />
* Bruce Momjian<br />
* Dave Page<br />
* Teodor Sigaev<br />
* Greg Smith<br />
* Greg Stark<br />
* Joshua Tolley<br />
* Robert Treat<br />
<br />
In addition, we will be joined via conference phone by Koichi Suzuki, Itagaki Takahiro and Toru Shimogaki from NTT who are not able to join us in person.<br />
<br />
== Agenda ==<br />
<br />
The following agenda will be used for the meeting:<br />
<br />
* 09:30 - Introductions<br />
* 09:40 - Source code management (David)<br />
* 10:10 - Coffee<br />
* 10:25 - Release management<br />
** Distributing release management (David)<br />
** [[Proposal for alpha releases]] (Peter)<br />
* 11:05 - Synchronous + Hot Standby, completion plans<br />
* 11:35 - Ways to improve commitfests (Josh)<br />
* 12:05 - The remaining Big Adoption Issues for PostgreSQL (Josh)<br />
* 12:35 - Auto-configure (Greg, Josh)<br />
* 13:05 - Lunch<br />
* 13:45 - Modules/plugins packaging, upgrading (Dimitri)<br />
* 14:15 - [[Parallel Query Execution]] - spread queries on multiple CPUs<br />
* 14:45 - More comprehensive testing (Peter)<br />
* 15:15 - Tea<br />
* 15:30 - Participation in SQL standards committee (Peter)<br />
* 15:50 - State of PgFoundry (Peter)<br />
* 16:20 - Upgrade-in-place plans (short, we have a session on this in the main program)<br />
* 16:40 - Any other business<br />
<br />
== Minutes ==<br />
<br />
= PostgreSQL Developer Meeting May 20, 2009 =<br />
<br />
Attending: Tom Lane, Michael Meskes, Dimitri Fontaine, Josh Tolley, Oleg Bartunov, Teodor Sigaev, Zdenek Kotala, Peter Eisentraut, Selena Deckelmann, Magnus Hagander, David Fetter, Stephen Frost, Greg Smith, Greg Stark, Dave Page, Robert Haas, Robert Treat, Joe Conway, Andrew Dunstan, Denis Lussier, Bruce Momjian, Jonah Harris, Marko Kreen<br />
<br />
Attending via Skype: Koichi Suzuki, Shimogaki-san, Itagaki-san, Simon Riggs, Greg Sabino Mullane<br />
<br />
== Source Code Management ==<br />
<br />
System worked for us so far. But David wants to change it. Most people at developer meeting are using Git. David thinks we should use Git for 8.5. What's wrong with using the Git mirror? Problems include: CVS-to-Git breaks; patching is harder for committers. Changing to Git will not increase committers. Peter did a survey of committers, and 12 out of 15 didn't want to switch. We don't want to change the process of having patches vetted before they're applied. SCM doesn't matter for this. Moving to Git might change the perception on working on branches.<br />
<br />
Continuing issues with Git Mirror. Mirror is currently broken due to rsync issue. Translators could make use of Git, but there are other issues. Message translation is also a bottleneck, weird scripts. Git mirror would be good for this. We can fix the mirror script, but could have issues in the future.<br />
<br />
Git for Windows works now. No longer an issue.<br />
<br />
One problem is that Git does not produce context diffs without add-on scripting. Source review for distributed review is different. Don't read the patch, you load into tree. But some people will expect context diffs. We might want to ask Git for integrated support. Need to solve that issue before moving over. Since most Git users use its browsing tools rather than using the diff directly, maybe if we used Git we wouldn't want context diffs.<br />
<br />
We can pull trees back to 7.4 and it works, but we need to test builds for this. Git has tags. Are we going to lose people who don't have Git? What about other platforms, including ones that might not have the full required toolchain to build git?<br />
<br />
How would the buildfarm work? Need to investigate if it'll work and how hard it'll be to change. Greg Sabino Mullane wrote a patch already. We could choose it through a config option. If we used Git and GitHub, we could test experimental patches through repository. OmniTi did this with the Dtrace probes.<br />
<br />
Using Git would reduce bitrot. And you can't tell committer modifications to submitted patches. And Git would help get work done more quickly. Finding reviewers who will use CVS may be difficult in the future. Young people won't use CVS. Greg Smith says that Truviso has switched to Git and it's much better, especially at reducing bitrot. Summer of Code students worked much better with distributed code management.<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItem|Check on context diffs & Git}}<br />
{{TodoItem|Change buildfarm scripts, client and server side, to add git as an SCM (Andrew Dunstan)}}<br />
{{TodoItem|Check whether all the buildfarm machines can be made to work directly with Git (Andrew Dunstan)|If not, some sort of CVS emulator (such as [http://www.kernel.org/pub/software/scm/git/docs/git-cvsserver.html git-cvsserver]) will need to be set up.}}<br />
{{TodoItemDone|Fix GitMirror script (Magnus)}}<br />
{{TodoItem|Confirm past releases can be built identically from Git, using binary diff}}<br />
{{TodoItem|Make decision on whether we'll change now.}}<br />
{{TodoItem|Announce decision to move to allow core/committers/contributors opportunity to really learn the tool before the switch}}<br />
{{TodoEndSubsection}}<br />
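<br />
The "binary diff" check in the action list could be approximated by hashing every file in two build or export trees and reporting the paths that differ. The sketch below is one possible approach, not an agreed procedure; the function names are made up for illustration, and a real comparison would probably need to skip SCM metadata directories.<br />

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            with open(full_path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(full_path, root)] = digest
    return digests


def differing_files(tree_a, tree_b):
    """Return sorted relative paths that are missing from one tree or differ."""
    a = tree_digests(tree_a)
    b = tree_digests(tree_b)
    return sorted(path for path in a.keys() | b.keys()
                  if a.get(path) != b.get(path))
```

An empty result from <code>differing_files</code> for a CVS-derived tree and a Git-derived tree of the same release would indicate the two are byte-for-byte identical.<br />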
<br />
== Alpha Releases ==<br />
<br />
The idea is to get back into the "release early, release often" mode. Do release tarballs after each commitfest. Check if buildfarm is green, wrap a tarball, make an announcement, package it. Packagers are already doing this in an unapproved way.<br />
<br />
Where are we going to get release notes for this? (open item) If you want to get early testing, we need to let people know what's in the release. We could just have a list of the commits and edit it down. If we made up release notes more frequently, it might be easier at the end. But that hasn't worked in the past. Bruce went over process of how we generate the release notes. We can just have a big blog of all the features or on the wiki, maybe just point people at the Commitfest page.<br />
<br />
What about users who use Alphas in production? We will get users who deploy alphas. That's their fault. Maybe we do a cat version bump after every alpha. No, that makes it harder. <br />
<br />
What would we name them? Date stamp vs. version number. Tie the alphas to the commitfest names. Vote to do Alpha1, alpha2, etc.<br />
<br />
psql should say "alpha". Do we want people to really test this? Some features are known broken at the end of a commitfest. Some people may think it's alpha-beta-final, and it's not. What else would we call it though? Maybe we should call it "CF". Or "bikeshed". But RPM (and probably other packaging systems) needs ASCII/numeric sorting N-V-R, which means it has to be before "beta".<br />
<br />
Conclusion: it's 8.5alpha1, 8.5alpha2, etc.<br />
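<br />
The sorting constraint mentioned above can be checked with a plain ASCII sort. This is only a sketch: real packaging tools such as RPM use their own version-comparison routines, which are not modelled here, and the list of version strings is made up for illustration.<br />

```python
# Plain ASCII ordering of version strings. This does not model RPM's
# own comparison rules; it only shows that "alpha" sorts before "beta".
versions = ["8.5beta1", "8.5alpha2", "8.5rc1", "8.5alpha1"]

ordered = sorted(versions)
print(ordered)
# ['8.5alpha1', '8.5alpha2', '8.5beta1', '8.5rc1']
```

A name such as "8.5CF1" or "8.5bikeshed1" would not sort ahead of "8.5beta1" in the same way, which is one reason the "alpha" naming wins here.<br />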
<br />
We want everyone using the same snapshot for testing as common reference point, and the alpha proposal does this. Would be announced only on postgresql.org, other news feeds will grab from there. And the explanation of the alpha process needs to be in the announcement.<br />
<br />
Actions:<br />
<br />
{{TodoSubsection}}<br />
{{TodoItemDone|Document alpha release process: [[Alpha release process]]}}<br />
{{TodoEndSubsection}}<br />
<br />
== Synchronous & Hot Standby ==<br />
<br />
Simon plans to complete hot standby for 8.5. But won't be at the very first commitfest. Currently Simon doesn't have time to complete the project immediately -- won't work on it for at least 6 weeks. Synchronous replication still in development. Dealing with each comment on Synch, Koichi believes that these will all get fixed, but needs review and debug.<br />
<br />
Peter asked for debug/review of Synch Replication. Can we get something for the first alpha now? Can we break it up into smaller pieces so that we can test it easier?<br />
<br />
Doing both projects at once is problematic. Code is actually pretty separate. Trying to tackle both at once is too difficult. Choosing an order would help. Maybe Synchronous replication should be first. We need reviewers for each thing, it doesn't make sense to have target dates for things. Professional development schedules affect Simon's availability. Robert Haas is happy to read the patch. Heikki will stay on hot standby. Josh asked Bruce to be responsible for Synch Rep. Peter's not sure what his schedule is. <br />
<br />
What about using fundraising to pay for review? Would be possible. Some consultants are committers. Andrew doesn't feel competent. Ask Alvaro? Andrew G.?<br />
<br />
We need to look at the features and prioritize what will be provided in the patch. <br />
<br />
Actions<br />
{{TodoSubsection}}<br />
{{TodoItem|Get a firm responsible committer for synch replication}}<br />
{{TodoItem|Confirm with Heikki}}<br />
{{TodoItem|Determine expected CF for patches.}}<br />
{{TodoEndSubsection}}<br />
<br />
== CommitFests ==<br />
<br />
Commitfests have been good but not perfect. Need improvements in a number of areas. For one the tools are horrible. Possible to extend existing tools like Request Tracker to do what we need? Mediawiki extension for rt?<br />
<br />
Robert H: frustrating part was the bad patches (cleanup issues, doesn't apply, etc.). One way to improve things is to find a procedure in which real reviewers review serious patches. Lower tolerance for useless patches. Would like software which does this. Github/buildfarm might help. Josh thinks triage using volunteers is the best way to do this. A lot comes down to tools, which Robert is working on. We could also use improvements in the archives.<br />
<br />
Last commitfest should be for "previously seen" patches only. Late patches will get bounced to the next version. We should make exceptions. CM needs to have authority to bounce stuff without argument.<br />
<br />
Simon: won't we just be transferring the pain from the last commitfest and making the penultimate commitfest horrible? What about just accepting a long integration phase? Don't want long integration because it's a halt to development. We need to have largest patches early, and smaller ones later.<br />
<br />
Small patches are not a problem. Easy to approve or reject. Tom made the mistake of committing the easy patches first during November fest, leaving little for less-experienced people to do. Won't do that again.<br />
<br />
Peter doesn't care about most patches. That's what the RRR handles. There are other things which are not that interesting. People need to explain this on -hackers. We're too restrictive about rejecting of patches because stuff isn't interesting (Robert H). We need more information about each patch. Webform? No, we need to discuss it on the mailing list. Add request for test cases, justification to guide for reviewers. Need a guide for submitters which includes all of this stuff. Need to include documentation, test case, etc., or reject patch.<br />
<br />
Original purpose of CF was to get more prompt feedback. Now we're trying to spread out patch submissions, too, so not all big patches land at final fest.<br />
<br />
Patches could also use short names as unique ids. Or maybe not ... there's patch drift. Each new post should include link to original submission e-mail. Should look at ReviewBoard again, has improved a lot in last few months. <br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItem|Work on tools}}<br />
{{TodoItem|Check ReviewBoard}}<br />
{{TodoItem|Improve [[Submitting a Patch]] (Greg Smith, Stephen)}}<br />
{{TodoItem|Fix archives?}}<br />
{{TodoItem|Change policy on accepting to last commitfest.}}<br />
{{TodoEndSubsection}}<br />
<br />
== Top Adoption Issues ==<br />
* Installers: pretty much fixed. Could use better documentation on Windows for troubleshooting, where the Windows installer puts files, how to run tools via the command line, and common issues.<br />
* Simple low-overhead replication. WANT!<br />
* Upgrade in place (below)<br />
* Admin tools need love, both monitoring and managing large numbers of installations (needed to expand PostgreSQL use for big [[Shared Database Hosting]] providers)<br />
* Driver quality needs some love. Official list of drivers in the [http://www.postgresql.org/download/products/2 Software Catalogue]<br />
** Perl & Ruby are good<br />
** MSFT and [[JDBC|JDBC4]] need *lots* of love.<br />
** [[JDBC]] performance needs *lots* of love.<br />
** Driver developers & maintainers need recognition.<br />
** ODBC needs love.<br />
** Python has no entries in the catalog...but there are multiple driver projects with no clear leader.<br />
* Into the future...<br />
** Module add-ons. We don't help yet (see "Modules" below).<br />
** Per-column locale/collation<br />
<br />
Simon has also heard lots of requests for [[Simon Riggs' Development Projects|VLDB features]]. Regarding PostGIS, we mainly need better module support for installing stuff like PostGIS. One-click installer takes care of some of this. Also there's RPM/Deb packaging. Linux packagers would like a list of which things to build. But staying up to date is very hard. <br />
<br />
Synchronous replication & Hot Standby are major gating factors, if we don't have these the PostgreSQL community may stop growing (Josh). Some discussion of existing replication features. If we put them in first CF we would start alpha program with a bang. Are we willing to commit stuff which is known not-complete? Why not just use Git and have a Hot Standby tree? That's what it's for. We need to find out what status is. Needs to be in the 1st or 2nd commitfest.<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItem|Blog the Top Priorities}}<br />
{{TodoEndSubsection}}<br />
<br />
== Auto-Tuning ==<br />
<br />
Greg Smith: some progress towards getting autotuning. Simple config script right now to output bare minimum config options. Works for some configuration settings, fairly conservative. Some bugs, but works OK for 8.4. The [[GUCS Overhaul|initial spec]] included converting between the various ways we see this information. Some people want heavily commented config file. Some think we should have a minimal config file. Some people think that we should have something in between. No consensus ever expected there, trying to reach one is a waste of time.<br />
<br />
Tool to convert into brief form, annotated form, etc. Current tool just comments stuff out and adds settings to beginning. Dump of pg_settings for 8.4 included with pg_tune tool so that we can generate stuff without database being up. <br />
<br />
Two development priorities: (1) re-write as C tool. (2) Would like a standard comment form for postgresql.conf. What comments are always in, vs. user comments. Have a different delimiter. Can't do this because there are no standards for comments. Also it's better to have separate user and auto-generated .conf files. Many newbies hate complex config file but sysadmins like it.<br />
<br />
Selena: might be more effective to deliver recipes to config management systems. Josh: recipes are too complicated, Greg: people use bad/ancient recipes all the time. We should have two separate files. Apache is good example with local.cf includes structure.<br />
<br />
(3) Issue: upgrading configuration files. That hurts upgrades. Also need config-test utility. (4) also need to generate a sysctl.conf. OSX is permanently broken.<br />
<br />
Sample is commented out and is included in main (automatic) configuration file. Or maybe use a directory. Subdirectory which we parse in alphabetical order. Directory scanning would also help modules install.<br />
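<br />
The directory idea can be sketched as follows: read every <code>.conf</code> file in a directory in alphabetical order and let later files override earlier ones. This is an illustration only; the file names and the simplified <code>name = value</code> parsing are assumptions, not the server's actual configuration parser.<br />

```python
import os
import tempfile


def load_conf_dir(conf_dir):
    """Merge every *.conf file in conf_dir, visiting files alphabetically.

    Later files override earlier ones. The 'name = value' syntax and '#'
    comment handling are simplified assumptions, not the real parser.
    """
    settings = {}
    for filename in sorted(os.listdir(conf_dir)):
        if not filename.endswith(".conf"):
            continue
        with open(os.path.join(conf_dir, filename)) as handle:
            for line in handle:
                line = line.split("#", 1)[0].strip()
                if not line or "=" not in line:
                    continue
                name, value = line.split("=", 1)
                settings[name.strip()] = value.strip()
    return settings


def demo():
    """Build a throwaway directory with a generated and a local file."""
    with tempfile.TemporaryDirectory() as conf_dir:
        with open(os.path.join(conf_dir, "00-generated.conf"), "w") as f:
            f.write("shared_buffers = 512MB  # written by a tuning tool\n")
            f.write("work_mem = 4MB\n")
        with open(os.path.join(conf_dir, "50-local.conf"), "w") as f:
            f.write("work_mem = 16MB\n")
        return load_conf_dir(conf_dir)


print(demo())
```

Because files are visited in sorted order, a tuning tool or a module installer could drop in its own file without editing anyone else's, and a hand-written file with a later name would still win.<br />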
<br />
What about putting it in the database? Don't even bring it up!<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItem|Implement Directories (Magnus, Greg Smith)}}<br />
{{TodoItem|Finish pg_tune (Greg Smith)}}<br />
{{TodoItem|Work out details on hackers list. (Greg/Magnus/Josh)}}<br />
{{TodoEndSubsection}}<br />
<br />
== Lunch ==<br />
<br />
We ate lunch. It was good.<br />
<br />
== Modules ==<br />
<br />
The first part for module design is user spec. Started [[Module Manager|working on it on the wiki]]. When modules are installed, they shouldn't get dumped, but re-installed for new versions. So we need to have objects belong to modules. Tables are special problem. We need a lot of design work in terms of what we want -- there are too many conflicting purposes right now.<br />
<br />
There is nothing in the SQL standard about this. What do we call it? Module, Package, names are taken. How about Extensions? Plug-ins?<br />
<br />
A lot of stuff hinges on Ownership. You need a concept of objects belonging to a module, and then we can build a lot of other functionality. <br />
<br />
The version upgrade problem is critical. PostGIS has special syntax for populating the PostGIS tables. We need concepts of objects which don't get pgdumped. But module data in module tables is a special problem. PostGIS has special tables with auto-created stuff. Maybe we need to run post-and-pre install scripts that handle specific pg_dump/pg_restore requirements. We need a way to deal with it. PostGIS one example hard case, tsearch2 another.<br />
<br />
Do we need special schemas? No. Dependency would solve this. But we could have special schema. But path is broken. <br />
<br />
Where do releases go so that people can install them? Maybe we should just have a file spec. Downloading stuff is completely separate, tools like Python's easy_install don't care how you got the bundle. We also want to support Linux packaging systems for this. We can be flexible about this as long as bundles match a directory spec. How do you make sure that compiled modules work with various platforms? <br />
<br />
How would dependencies work? Lots of discussion.<br />
<br />
Do we want to have module permissions? People seem to think no. Filesystems don't do this. But PG is not a filesystem. Maybe we should just do this with schemas. Or we just fix GRANT to support wildcards.<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItem|Finish design spec for "modules"}}<br />
{{TodoItem|Pick new name}}<br />
{{TodoItem|Figure out how to deal with internal module data}}<br />
{{TodoEndSubsection}}<br />
<br />
== Parallel Query Execution ==<br />
<br />
Zdenek has a Master's student who is working on [[Parallel Query Execution]]; he is starting by trying to thread Postgres. But the Postgres backend isn't thread-safe, so this can't be done. Early POSTGRES had a prototype of multi-process parallel query. Zdenek wants to use threads, but there are other ways. The big win for parallel execution is in long-running queries, and there the difference in overhead doesn't matter. <br />
<br />
Zdenek will discuss using a multi-process model with the student. Tom claims that a thread-based model will never run. Simon agrees.<br />
<br />
Two concepts of parallel query: executing query nodes in parallel, or splitting a single node (i.e. workers each handle a portion of a sequential scan). This is for parallel query on a single machine. You pretty much know how many workers you'll need, about 1/(num cores). What about overallocation of workers? Actually, that's an opportunity for resource management, rebalance workers whenever a new query is added. Original POSTGRES had some code for parallel query. The system can know what resources it has available. Simon had done some work on this. "chunk out" work.<br />
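<br />
The "splitting a single node" idea can be illustrated with a minimal sketch. It assumes a table is a run of numbered blocks and that there is one worker per core (one reading of the note above); the function name <code>split_scan</code> is hypothetical and nothing here reflects actual backend code.<br />

```python
from typing import List


def split_scan(total_blocks: int, workers: int) -> List[range]:
    """Divide blocks 0..total_blocks-1 into contiguous ranges, one per worker.

    Illustrative only: block numbering and the one-range-per-worker policy
    are assumptions, not how any PostgreSQL release divides a scan.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    base, extra = divmod(total_blocks, workers)
    ranges = []
    start = 0
    for w in range(workers):
        # The first `extra` workers take one additional block each,
        # so range sizes differ by at most one.
        size = base + (1 if w < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

Rebalancing when a new query arrives would amount to calling such a function again with a smaller worker count for the blocks not yet scanned.<br />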
<br />
Sharing snapshot clones is needed for parallel dump, too; Andrew is looking into it. The current clone code from Simon works with 8.1 and needs to be updated to 8.4. The code is written as an extension rather than a backend patch. The way it currently works is not expected to be committed, though, and changes to snapshots since then allow us to do the same thing using shared memory.<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItem|Zdenek to talk with grad student}}<br />
{{TodoItem|Jonah to help Zdenek find the POSTGRES stuff.}}<br />
{{TodoEndSubsection}}<br />
<br />
== Comprehensive Testing ==<br />
<br />
Peter has done work on test coverage. We have a bunch of code which is not tested in any organized way. How do you do patch review without tests? And we probably have bugs we don't know about. We need a lot more test cases.<br />
<br />
Problems are that huge numbers of tests will take too long to run. So divide them up into chunks and run just one chunk. Also maintenance is an issue. Some testing stuff is very difficult, like testing recovery. But simple tests should be simple. Greg Smith says frameworks/harnesses don't really help us that much. We don't have a way to test big issues like "does pg_restore work?".<br />
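<br />
The "divide them up into chunks and run just one chunk" idea can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the use of a CRC32 hash to assign tests to chunks is an assumption made so that a test always lands in the same chunk regardless of which other tests exist.<br />

```python
import zlib
from typing import Iterable, List


def chunk_of(test_name: str, num_chunks: int) -> int:
    """Map a test name to a stable chunk index in [0, num_chunks)."""
    return zlib.crc32(test_name.encode("utf-8")) % num_chunks


def select_chunk(tests: Iterable[str], num_chunks: int, chunk_index: int) -> List[str]:
    """Return only the tests that belong to the requested chunk."""
    return [t for t in tests if chunk_of(t, num_chunks) == chunk_index]
```

Each run would pick one <code>chunk_index</code>; running every index in turn covers the whole suite exactly once.<br />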
<br />
The question is whether anyone knows testing better than Peter. Greg is already running more complicated tests.<br />
<br />
Performance regression testing. Greg has a tool for running pgbench automatically. It saves all results in a database and it's possible to chart them over time for this purpose, just need to add new reports. Code is not public yet, Greg will post. Are there benchmarks we can just use? Nothing is complete/easy/fast. We need to do development.<br />
<br />
Simon: we don't have optimizer testing. Tom: we need something to test really complex cases.<br />
<br />
We need a framework to save tests over time so that we're building up a suite of test cases instead of throwing them away.<br />
<br />
We can write tests in Perl if we want. And then we can use Perl-based frameworks. Performance tests may need to pass within a "margin of error". We already have a build dependency on Perl.<br />
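<br />
A "margin of error" check for a performance test might look like the sketch below. The function name, the use of transactions per second as the metric, and the 5% default are all assumptions for illustration; the meeting did not settle on any of them.<br />

```python
def within_margin(baseline_tps: float, measured_tps: float, margin: float = 0.05) -> bool:
    """Return True if measured throughput is at most `margin` below the baseline.

    A faster-than-baseline result always passes; only regressions larger
    than the tolerated fraction fail.
    """
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be a fraction in [0, 1)")
    return measured_tps >= baseline_tps * (1.0 - margin)
```

The point of the tolerance is that a test which fails on ordinary run-to-run noise would soon be commented out and forgotten.<br />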
<br />
We will need a test suite for in-place upgrade; Bruce is already using the regression test database for testing pg_migrator.<br />
<br />
Performance tests may need to be separate from issue tests. Also, don't introduce tests that fail regularly, too much pressure to comment them out and forget about them.<br />
<br />
We would also like to be able to run other stuff on the buildfarm, like specific unit tests about platforms. Maybe not performance, but testing for portability issues. Flag stuff in the buildfarm for other tests. First priority is to get the buildfarm working with Git. After that maybe some of the performance regression tests.<br />
<br />
We also need things like driver tests. And Slony, and other stuff.<br />
<br />
We are testing contrib.<br />
<br />
We don't want to start a separate mailing list. Do it on hackers or the issue will die. Maybe move it later when the effort takes off.<br />
<br />
Has anyone used [http://blogs.sun.com/tm/entry/hudson_a_a_tool_for Hudson]? [http://www.cmake.org/ CMake] lacks documentation.<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItemDone|Peter & Greg Smith to talk.}}<br />
{{TodoItemDone|Post pgbench-tools code. (Greg Smith)}}<br />
{{TodoItem|Research existing OSS test frameworks. (Greg, Peter)}}<br />
{{TodoEndSubsection}}<br />
<br />
== SQL Standards Committee ==<br />
<br />
Working at Sun, got Sun/MySQL seat on the SQL Committee. Have to pay $1000 to sign up, then pay $400 a year. Have to name an individual delegate and backup delegate. Every other month is phone conference. You have to attend one before you vote, and if you miss too many you get dropped. There are in-person meetings twice a year, but you're not required to attend.<br />
<br />
This would give us access to the drafts and whether or not we're still working on various parts.<br />
<br />
So, who's interested? Peter, Stephen Frost, David Fetter, Robert Treat, Greg Stark, Josh Berkus.<br />
<br />
Josh talked some about the TPC, and TPC membership and what's involved. But TPC is probably too expensive right now. Discussion ensued.<br />
<br />
Actions:<br />
{{TodoSubsection}}<br />
{{TodoItemDone|Peter to get full details.}}<br />
{{TodoItemDone|Peter to propose to funds group.}}<br />
{{TodoItem|Organize committee. }}<br />
{{TodoEndSubsection}}<br />
<br />
== pgFoundry ==<br />
<br />
We set up new hardware, and Gsmet was going to help us migrate to it. But he disappeared. There's a new open source version called FusionForge. <br />
<br />
pgFoundry has some deviations from the mainstream version, like FreeBSD fixes. Infrastructure team has no idea how it works, done in a rush. The changes aren't that great. <br />
<br />
Now we have githost too which is more hosting. <br />
<br />
Robert Treat thinks we should use a bunch of external hosting, sourceforge, Google, etc. But we have existing projects and community stuff already on there. <br />
<br />
We shouldn't kill it off before we have the design spec for Extensions. pgFoundry is also important for helping people find Postgres stuff. But most of our most popular projects don't use it. But what about smaller projects? Sourceforge doesn't place any restrictions.<br />
<br />
What about packagers using the FTP mirrors? Stackbuilder is using it.<br />
<br />
How much time are we spending on it? Some fixing because of breaking down once a month. And we're spending about an hour a week admin-ing it, but there's a lot of things we don't do. Josh went over the disaster of the original pgFoundry deployment.<br />
<br />
There's also [http://vhffs.org/ VHFFS] as an alternative to pgFoundry. But all docs are in French. (It seems this is being improved: installation guides & FAQs are available in English; technical docs are in French only.)<br />
<br />
What about putting together something as a replacement? We don't have the manpower. How about migrating to a new machine? What about killing projects? Not relevant to this issue. <br />
<br />
The real problem is that nobody wants to work on it.<br />
<br />
What about moving it to Linux? Well the whole infrastructure is on FreeBSD. We don't have a way to manage it on Linux. <br />
<br />
Plan: do clean, good install on the new machine and just move the database, the mailing lists and the cvs and html files. <br />
<br />
{{TodoSubsection}}<br />
{{TodoItem|Peter to join GForge team.}}<br />
{{TodoEndSubsection}}<br />
<br />
[Update: Gsmet came back and is doing the upgrade to FusionForge.]<br />
<br />
== Upgrade-in-Place ==<br />
<br />
Bruce is working on pg_migrator for 8.3-->8.4. Two major issues we need to solve: storage upgrade, catalog upgrade. There are several ways to do that; we need to decide which one to use.<br />
<br />
It's a different project from other features. Other features are "done", but UIP requires everyone who submits a patch to do something about UIP for every patch which changes a catalog or disk structure. It's how Illustra did it. We need standard code for that.<br />
<br />
Should we ship pg_migrator with the core code? Not this release, maybe next one.<br />
<br />
pg_migrator has several issues: (1) you need the old version of the server, (2) storage and files and protecting TOAST, (3) pg_dump dumps the DDL commands and you lose information (that's being fixed). Zdenek has catalog upgrade prototype which does not require old version to be around. Upgrade the pg_upgrade tablespace and most stuff into the new tablespace. What does this solve though? Well, not having binaries around from the old version, also it preserves the tablespaces. <br />
<br />
But there is maintenance for Zdenek's version because you need to add migrations for each system catalog. About 50 system catalog changes per version, so over 5 years this would be 250 migrations to maintain. How about adding version numbers to catalog entries instead? Some issues/limitations. Getting pg_migrator working is good but there are still holes. And internally pg_migrator is fairly complex. But storing up delta changes is unreasonable for stuff like ROLE support changes ... the transformations would be awful. <br />
<br />
What's our commitment to pg_migrator? Will we support alphas? pg_migrator pg_dumps catalogs and transforms them. <br />
<br />
Let's look at the heap and the index. You have the heap format and the bitmaps. If you're looking at the page format, we'd need to read and change the bits. Currently pg_migrator links stuff over. We'd have to copy it for heap page stuff.<br />
<br />
If we're changing the pages, we'd have to look at the pages in copy mode or rewriting the pages. For datatype changes we create a new column in a schema, then copy stuff over into a new OID. Are datatype changes going to require the developer to make the change? For large databases, reindexing is not feasible. Reindexing is often 80% of migration time on big systems (Josh).<br />
<br />
Maybe we'll just have some versions we can't upgrade. Zdenek says that we should guarantee upgradability forever; other projects have done it. Fight about which features are important enough to justify breaking upgrade-in-place; how are we going to make that decision? <br />
<br />
What did we do for 8.2->8.3? We changed the numeric format, var-varlena, HOT. And Phantom Command ID. <br />
<br />
3 Methods for UIP: convert-on-read (diagram). The big problem there is TOAST data. Write the converted page back as you read it. 2nd method is Read Old, Write New. Zdenek has prototype of this. 3rd Method is Modularized AM. The whole relation is in one format or another and is read through a converter. For indexes we can just treat stuff as a separate index method. But we can convert heaps with update on read and do this for indexes.<br />
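<br />
The convert-on-read method (read an old-format page, convert it, write the converted page back) can be sketched roughly as below. Everything concrete here is an assumption for illustration: the <code>Page</code> type, the version numbers, the dictionary standing in for storage, and the padding-byte "format change" are hypothetical and do not describe the real page layout or any prototype discussed at the meeting. TOAST data, named above as the big problem, is not modelled.<br />

```python
from dataclasses import dataclass
from typing import Callable, Dict

CURRENT_VERSION = 4  # hypothetical newest page layout version


@dataclass(frozen=True)
class Page:
    version: int
    payload: bytes


def v3_to_v4(page: Page) -> Page:
    # Hypothetical format change: the v4 layout appends one padding byte.
    return Page(version=4, payload=page.payload + b"\x00")


# One single-step converter per old version; longer chains are applied in order.
CONVERTERS: Dict[int, Callable[[Page], Page]] = {3: v3_to_v4}


def read_page(storage: Dict[int, Page], block_no: int) -> Page:
    """Read a page, upgrading it to CURRENT_VERSION and writing it back if needed."""
    page = storage[block_no]
    converted = False
    while page.version < CURRENT_VERSION:
        page = CONVERTERS[page.version](page)
        converted = True
    if converted:
        # Write the converted page back so the conversion happens only once.
        storage[block_no] = page
    return page
```

The sketch also hints at the maintenance cost raised earlier: every format change adds another entry to the converter table that has to be kept working.<br />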
<br />
Side issue ... would like to be able to move tables in binary form between systems. <br />
<br />
Tabled for tomorrow's session.<br />
<br />
== Other Business ==<br />
<br />
Josh brought up the issue of negative experiences for new hackers. It's a bit of an ordeal for a lot of new people. How could we improve it? Ubuntu put up a code of behavior for community folks. On our lists, people give really honest blunt feedback. But it's not good for new people. It's not really good to get too emotional even for people who can take it. Reported that students are afraid to post their patches. <br />
<br />
Andrew says take a breath and step back before getting emotional. Or take it off-list. But it has to be a general change of behavior for everyone. And use reasons rather than just facts. Our FAQ also needs cleaning. And "search the archives" isn't very helpful. We need to be nice all the time so it's a habit.<br />
<br />
Action:<br />
{{TodoSubsection}}<br />
{{TodoItem|Look at Ubuntu Code of Conduct}}<br />
{{TodoItem|Be nice}}<br />
{{TodoEndSubsection}}<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PGCon2009JapanClusterDeveloperMeeting&diff=37568PGCon2009JapanClusterDeveloperMeeting2023-02-10T08:43:27Z<p>Alvherre: </p>
<hr />
<div>== Participants' Input to the Meeting ==<br />
<br />
This is the first PostgreSQL cluster developers meeting. The meeting is held as an associated event of PostgreSQL Conference 2009 Japan. For details of the meeting, please visit [http://www.postgresql.jp/events/pgcon09j/e/dev_mtg Call for Participants].<br />
<br />
Participants in the meeting are encouraged to submit input for the meeting on this page. Organizational information can be found here: [[PostgreSQL_Conference_2009_Japan]].<br />
<br />
== Links to Information about Clustering Solutions ==<br />
<br />
Please put links here to software, background material, presentations, and other information about your particular clustering software. <br />
<br />
=== Notes On General Design ===<br />
<br />
Both for this page and for your 5-minute presentation, please try to answer the following questions about your current solution:<br />
<br />
* What is the primary use-case your solution is designed for?<br />
* General Design<br />
** What's the general clustering architecture of your software? (e.g. statement replication, shared memory, GCS, clustered table, etc.)<br />
** Does it consist of a collection of tools or a monolithic control-and-management architecture?<br />
** Does your solution supply administration and monitoring tools?<br />
** Is your solution intended to be generic, or does it require the user's application to be built around the clustering architecture?<br />
** Does the solution require patches on core PostgreSQL, or is it entirely external?<br />
* Availability: how does your software deal with uptime availability?<br />
** Is failover automated?<br />
** Is data replicated between nodes synchronously or asynchronously?<br />
** What failure conditions can it protect against, and which can't it?<br />
** Does it work over a WAN or do clustered machines have to be in the same data center?<br />
* Scalability: how does your software help with horizontal scaling?<br />
** How does it scale for reads, if at all?<br />
** How does it scale for writes, if at all?<br />
** Is it designed more to scale for many small queries, a few large queries, or for geographic distribution?<br />
** What kinds of non-simple-query operations (reporting queries, stored procedures, triggers, etc.) can it handle, and which can it not?<br />
* Status: what's the project's current development and adoption status?<br />
** Is it still under active development?<br />
** How mature is it? prototype / beta / first release / second release / in maintenance<br />
** How widely adopted is it? Is it used only by customers of the developers, or by PostgreSQL users in general?<br />
<br />
=== Links ===<br />
<br />
*Some theory:<br />
**[http://www.cs.purdue.edu/homes/bb/cs542-05Spr/ParallelDBMS.ppt Basics about DB clustering] by Tamer Özsu & Patrick Valduriez<br />
<br />
*DB cluster software:<br />
**Slony-I: [http://www.slony.info/documentation/ Documentation]<br />
**pgpool-II: [http://pgpool.projects.postgresql.org/ Home Page]<br />
**pgbouncer: [http://pgbouncer.projects.postgresql.org/ Project Home Page]<br />
**PL/Proxy: [http://pgfoundry.org/projects/plproxy/ PGFoundry Home Page]<br />
**PgCluster: [http://pgfoundry.org/projects/pgcluster/ PGFoundry Home Page]<br />
**Postgres-R: [http://www.postgres-r.org/ Project Home Page] (see esp. the [http://www.postgres-r.org/downloads/concept.pdf concept document] and the [http://www.postgres-r.org/documentation/references references] for related scientific papers)<br />
**PostgresForest: [http://www.nttdata.co.jp/services/postgresforest/ Home Page in Japanese]<br />
**Bucardo: [http://bucardo.org/wiki/Main_Page Project Home Page]<br />
**GridSQL: [http://www.enterprisedb.com/community/projects/gridsql.do#ui-tabs-57 Architecture Page]<br />
**Postgres-2: [http://wiki.postgresql.org/wiki/Postgres-2 Postgres-2 Introduction Page]<br />
**Streaming Replication: [http://wiki.postgresql.org/wiki/Streaming_Replication Project Home Page]<br />
**Mammoth Replicator: [https://projects.commandprompt.com/public/replicator Project Home Page]<br />
**rubyrep: [http://www.rubyrep.org Project Home Page]<br />
<br />
== Clustering Marketplace ==<br />
<br />
What do you think current commercial and user demand for clustering is? Are users trying to get scalability, availability, or other benefits from Clustering? Has interest in clustering waned or grown?<br />
<br />
Josh Says: I've seen the desire for an "Oracle RAC Replacement" is less prominent, at least in the USA, than it was a few years ago. It's possible that folks are realizing that RAC has a lot of drawbacks, or it may just be that I don't talk to Oracle users as much. People seem to be looking more for clustering to help with horizontal scalability, especially to help with performance on cloud hosting platforms, which is a big source of demand now. With MySQL in trouble, people are really looking for an "approximate consistency, low administration" solution to replace MySQL Replication & NDB.<br />
<br />
== Challenges and Issues ==<br />
<br />
What challenges are you currently facing in working on clustering and replication? What things do you think should be different about core Postgres or developed in common?<br />
<br />
== Agenda For Day ==<br />
<br />
Please contribute to this agenda! It is not yet final, but we do need to have a list of items people want to talk about before the meeting itself. If you add an item to the agenda, please put your name next to it so we know who to call to start the item.<br />
<br />
The meeting will run from 9:30AM to 5:30PM.<br />
<br />
=== Introduction ===<br />
<br />
Current clustering definitions, use cases and customer goals (1/2 hour short presentation, Koichi-san and Josh Berkus)<br />
<br />
# User goals<br />
# Specific use cases<br />
# Current market for PostgreSQL clustering<br />
# Competitive market of other DBMSes<br />
<br />
[[ClusteringUseCases|Notes from Josh Berkus' presentation]]<br />
<br />
* NTT's Keynote: [[Media:NTT Proposal091119.pdf]]<br />
<br />
=== Review of Existing Projects ===<br />
<br />
Each project team will be welcome to give a 5-minute presentation about<br />
your current development work on your clustering or replication solution<br />
near the beginning of the event. Note that you are NOT obligated to<br />
give this presentation; if you feel that your current efforts are well<br />
enough known, or if you have no time to prepare, you may choose not to<br />
give a presentation.<br />
<br />
In order to have a productive day, please design your presentation around the following:<br />
* Presentations will be 5 minutes only, strictly timed.<br />
* To keep to the 5-minute limit, presentations will be given on *my* laptop using PDF slides. Please bring a PDF with you to the meeting, or (better) e-mail it to me before the meeting.<br />
* Please discuss *current* work on your software and challenges you are currently facing. Summaries of the history or features of your solution are unnecessary unless they have changed in the last year. Instead, link to these on the wiki.<br />
<br />
Each team which is giving a presentation should sign up below:<br />
<br />
* [[Image:ClusterDeveloperMeeting_-_PostgresForest.pdf]] Postgres Forest status (Satoshi Nagayasu)<br />
* [[Image:Bucardo_in_Five_Minutes.pdf]] Bucardo (Selena Deckelmann)<br />
* [[Image:Pgcluster4CDM.pdf]] PgCluster update<br />
* [[Image:Postgres-R.pdf]] Postgres-R: Flashlight (Markus Wanner)<br />
* Streaming Replication: [[Media:SR ClusterSummit.pdf]] (Fujii)<br />
* [[Image:Postgres-2_Write-Scalable_Cluster.pdf]] Postgres-2<br />
* [[Image:Gridsql_jpug2009a.pdf]] GridSQL<br />
<br />
=== Future Requirements and Expectations ===<br />
<br />
Discussion: please add any discussion items you have around the future of database clustering, the demand for it, and user needs around clustering:<br />
<br />
# Common issues to several products<br />
#* Usability<br />
#* Administration<br />
# Application or industry specific issues<br />
<br />
[[ClusterFeatures]]<br />
<br />
=== Technical Issues in clustering design ===<br />
<br />
Please add any items you have around specific technical issues in clustering design, especially unsolved or recently solved ones:<br />
<br />
# Challenges<br />
#* High Availability<br />
#* Scalability (read/write)<br />
# Specifications and APIs<br />
<br />
=== Plans for future development ===<br />
<br />
Please add any items you want to discuss around future development, especially development involving a collaboration between teams or with the general PostgreSQL community.<br />
<br />
# To be developed in core PostgreSQL<br />
#* ReplicationHooks -- where did they go?<br />
#* Standby/replication (sync/async)/partitioning<br />
#* Transaction Management<br />
#* 2PC callback functions?<br />
#* APIs and interfaces ([[ClusterFeatures]])<br />
#* Tools<br />
#* (MW) common unit and/or regression testing harness?<br />
#* (MW) common benchmark framework?<br />
#* libpq improvements (keepalive / query timeout, full duplex)<br />
<br />
# To be developed separately<br />
# Merging clustering projects/products?<br />
<br />
=== Visibility to the market ===<br />
<br />
* provide information to general users<br />
* How can we make things visible to non-development people<br />
* Make agreed matrix to describe each product<br />
* How it is measured<br />
* Info Pages / Videos / Howtos (needed later on)<br />
* still need something that's an introductory material<br />
* PostgreSQL Manual - but not simple to find which to use<br />
* See letspostgres.jp -> focus on practical information<br />
<br />
* Documentation sprint<br />
** DBAs --> core developers there to help document<br />
** Availability -> <br />
** Send a DBA to do this: from all the different groups<br />
** NTT wants to offer resources<br />
<br />
* How to implement specific solutions<br />
** Use cases<br />
** two cases: technical implementer, AND their boss to convince<br />
** Updated clustering survey presentation (video)<br />
** Webinar<br />
<br />
* Packaging<br />
** Because some projects don't have them, they don't look officially supported<br />
** E.g. stackbuilder and one-click installers<br />
** Clustering packages<br />
* External module docs<br />
<br />
=== Follow-up ===<br />
<br />
* [http://it.toolbox.com/blogs/database-soup/collaborating-on-clustering-35456?rss=1 Summary of session]<br />
* [[Clustering]] portal page<br />
<br />
=== Schedule and Map ===<br />
<br />
The meeting schedule and a map from nearby stations can be found in [[Media:Schedule_and_Map.pdf]].<br />
<br />
== Contact Information ==<br />
<br />
Communication for the clustering summit has been on Josh Berkus's clustering@berkus.org mailing list.<br />
<br />
Phone numbers for Koichi Suzuki, Michael Paquier and Josh Berkus have been sent by e-mail. Note that many/most foreign cell phones do not work in Japan.<br />
<br />
Please list below your name and the hotel you are staying at in case we need to find you:<br />
<br />
* Josh Berkus: Shiba Park Hotel<br />
* Bruce Momjian: Park Hotel Tokyo<br />
* Jan Wieck: Park Hotel Tokyo<br />
* Markus Wanner: Shiba Park Hotel<br />
<br />
== Miscellaneous Travel Tips ==<br />
<br />
* Suica Card: Foreign Passport Holders can purchase a Suica + NEX package at Narita Airport for 3500 Yen. It consists of a one-way fare NEX to Tokyo and a 1500 Yen precharged Suica chipcard, that can be used on underground trains in Tokyo (plus 500 Yen deposit for the card). Considering that the one-way fare of NEX is 3000 Yen alone, that looks like a great deal. See http://www.japan-guide.com/e/e2359_002.html for details. -- Jan<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Replication]]<br />
[[Category:Clustering]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2010_Developer_Meeting&diff=37567PgCon 2010 Developer Meeting2023-02-10T08:43:23Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers and senior figures from PostgreSQL-developer-sponsoring companies is being planned for Wednesday 19th May, 2010 near the University of Ottawa, prior to pgCon 2010. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
This is a PostgreSQL Community event. Room and lunch sponsored by EnterpriseDB. Other companies sponsored attendance for their developers.<br />
<br />
== Actions ==<br />
<br />
* Josh & Selena will '''[[PostgreSQL_9.0_Open_Items|track open items]]''' and make sure they get listed or tracked and resolved.<br />
* Selena to '''get reviewers started on commitfest early - around June 15'''<br />
* '''Branch''' on July 1, CF July 15.<br />
* '''Announce a plan''' for next development schedule.<br />
* '''No more branching''' for alphas.<br />
* Stephen's '''intern to develop PerformanceFarm application'''. Will need help from Dunstan/Drake etc.<br />
* Kaigai, Stephen, Smith, etc. to get together at pgCon and hash out some more security provider issues.<br />
* Magnus to set up git environment emulator.<br />
* Andrew to publish checklist of how to set up your Git<br />
* '''Move to Git August 17-20''': Magnus, Haas, Dunstan. Frost will be out.<br />
* Koichi to '''extract patch from PostgresXC for snapshot cloning''' and submit.<br />
* Koichi to '''come up with proposed patch design for XID feed'''<br />
* Develop '''specification for commit sequence / LSN data'''<br />
* '''[[DDL Triggers]] Wiki page to be updated''' with spec by Jan, Greg M, et al<br />
* Dimitri to do '''patch (regarding extensions..)''' More detail?<br />
* EDB '''to decide on opening code''' or not for SQL/MED<br />
* '''Review Itagaki's git repo code''': Heikki, Peter SQL/MED<br />
* '''Itagaki to keep working on API''' -- what about Peter? SQL/MED<br />
* '''Document what the plan is to do a conversion upgrade''' (Greg Smith) -- pg_update<br />
* '''Copy Zdenek's code''' (Greg Smith) related to pg_update<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be from 9AM to 5PM, and will be in the O'Connor room at:<br />
<br />
Arc The Hotel<br />
140 Slater Street<br />
Ottawa<br />
Ontario<br />
K1P 5H6<br />
<br />
[http://maps.google.ca/maps?f=q&source=s_q&hl=en&geocode=&q=ARC+THE.HOTEL+|+140++Slater+Street,+Ottawa,+Ontario,+K1P+5H6&sll=49.891235,-97.15369&sspn=45.043582,78.486328&ie=UTF8&hq=ARC+THE.HOTEL+|&hnear=140+Slater+St,+Ottawa,+ON&z=16&iwloc=A Google Maps]<br />
<br />
Food and drink will be provided throughout the day.<br />
<br />
== Invitees ==<br />
<br />
The following people have RSVPed to the meeting:<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Joe Conway<br />
* Jeff Davis<br />
* Selena Deckelmann<br />
* Andrew Dunstan<br />
* David Fetter<br />
* Dimitri Fontaine<br />
* Marc Fournier<br />
* Stephen Frost<br />
* Magnus Hagander<br />
* Robert Haas<br />
* Tatsuo Ishii<br />
* Takahiro Itagaki<br />
* KaiGai Kohei<br />
* Marko Kreen<br />
* Tom Lane<br />
* Heikki Linnakangas<br />
* Michael Meskes<br />
* Bruce Momjian<br />
* Dave Page<br />
* Teodor Sigaev<br />
* Greg Sabino Mullane<br />
* Greg Smith<br />
* Greg Stark<br />
* Koichi Suzuki<br />
* Joshua Tolley<br />
* Robert Treat<br />
* Jan Wieck<br />
<br />
== Agenda ==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00<br />
|Tea, coffee upon arrival <br />
|<br />
|-<br />
|09:15 - 09:25<br />
|Welcome and introductions<br />
|Dave Page<br />
|-<br />
|09:25 - 09:45<br />
|Review of the 9.0 development process <br />
|Dave Page<br />
|-<br />
|09:45 - 10:35<br />
|Development Priorities for 9.1: General discussion <br />
|Josh Berkus<br />
|-<br />
|10:35 - 10:45<br />
|9.1 Development timeline<br />
|Robert Treat<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:45 - 11:00<br />
|Coffee break<br />
|<br />
|-<br />
|11:00 - 11:15<br />
|Performance QA/Performance Farm planning update<br />
|Greg Smith<br />
|-<br />
|11:15 - 11:50<br />
|Advanced access control features [[:Image:Pgcon2010-dev-security.pdf|(Slides)]]<br />
* Steps to support [[ESP|external security providers]]<br />
* [[RLS#Issues|Issues]] of row-level access control<br />
|KaiGai Kohei<br />
|-<br />
|11:50 - 12:30<br />
|CVS to GIT: The finale?<br />
|Dave Page<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch <br />
|<br />
|-<br />
|13:30 - 13:55<br />
|[[ClusterFeatures#Export_snapshots_to_other_sessions|Snapshot Cloning]]<br />
|Koichi Suzuki<br />
|-<br />
|13:55 - 14:20<br />
|[[ClusterFeatures#XID_feed|XID feed for clones]]<br />
|Koichi Suzuki<br />
|-<br />
|14:20 - 14:45<br />
|[[ClusterFeatures#Modification_trigger_into_core_.2F_Generalized_Data_Queue|General Modification Queue]]<br />
|Itagaki Takahiro, Jan Wieck, Marko Kreen<br />
|-<br />
|14:45 - 15:10<br />
|[[ClusterFeatures#DDL_Triggers|DDL "triggers"]]<br />
|Jan Wieck<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:10 - 15:25<br />
|Tea break<br />
|<br />
|-<br />
|15:25 - 15:55<br />
|[[SQL/MED]] [[:Image:Pgcon2010-dev-sqlmed.pdf|(Slides from Heikki)]]<br />
* Including: [[ClusterFeatures#Function_scan_push-down|function scan push-down]]<br />
|Itagaki Takahiro<br />
|-<br />
|15:55 - 16:20<br />
|Status report on Modules [[:Image:Pgcon2010-dev-extensions.pdf|(Slides)]]<br />
|Dimitri Fontaine<br />
|-<br />
|16:20 - 16:45<br />
|In-place upgrade with pg_migrator progress<br />
|Bruce Momjian<br />
|-<br />
|16:45 - 17:00<br />
|Any other business<br />
|Dave Page<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
| <br />
|}<br />
<br />
== Minutes ==<br />
<br />
= 2010 PostgreSQL Developer Meeting =<br />
<br />
Ottawa, Canada<br />
<br />
Present: Tatsuo Ishii, Andrew Dunstan, Bruce Momjian, David Fetter, Jeff Davis, Itagaki Takahiro, Koichi Suzuki, Josh Berkus, Dave Page, Dimitri Fontaine, Marko Kreen, Michael Meskes, Joe Conway, Josh Tolley, Greg Sabino Mullane, Selena Deckelmann, Stephen Frost, Robert Treat, Robert Haas, Magnus Hagander, Kohei Kaigai, Heikki Linnakangas, Tom Lane, Jan Wieck, Oleg Bartunov, Teodor Sigaev, Marc Fournier, Greg Smith, Greg Stark, Peter Eisentraut (via Skype)<br />
<br />
== Review of The 9.0 Development Process ==<br />
<br />
How did the commitfest work? Do we feel that the process worked in general, do we like Robert's CF application? What other parts of the process should we improve?<br />
<br />
David Fetter commented that the writable CTE patch went through more than one CF without adequate feedback, and the patch got rejected. Should we not allow things to be bumped, or not bumped twice? Still listed as open on November Commitfest. RH thinks feedback was provided but it might not have been very clear. It ''was'' reviewed more than once. RT: maybe we shouldn't be so quick to bump people in the last CF. Writeable CTE wasn't bumped until Feb. 10. Part of the issue is that it's a very complicated patch.<br />
<br />
JB feels that integration/testing needs to be more structured. Still amorphous. If we had more structure, maybe it would go faster. RH: we had a lot of open items, we closed them and released Beta1. Agrees that we need more concrete criteria. BMomj: we end up with a pile of really hard problems which we don't know how to fix. Even now can't fix max_standby_delay. Needs to be a fire under someone. Open items list is a big win, but doesn't show the scope of the problem. Didn't have anything on open items list until this AM for Beta. Need to reconstitute list.<br />
<br />
If stuff is on the open items list it stops release. DF: maybe we should have ratings of complex/not? Perhaps we need release manager to keep up on open items etc.? Or Beta manager. Not everyone knows what every open item is. Someone should track the list and see status. Need list to know what to work on. JB would like to put stuff on the open items list. If there's a thread on hackers put it on the list.<br />
<br />
How do we get all the big patches in the first commitfest or second instead of last? Assumes people are working while releasing. Why didn't HS get into first commitfest? Post-CF, prerelease is long and delays development, people take a vacation for 6 months. Also the CF reviews were not very good for big patches. CFs worked well for small/medium patches. But for big patches not so great. KNNGiST and WriteableCTE not so great. HS at least didn't get into last commitfest.<br />
<br />
How much of a problem is this issue going to be for 9.1? Do we have anything that large? Synchronous Replication. SQL/MED?<br />
<br />
Josh & Selena will track open items and make sure they get listed or tracked and resolved.<br />
<br />
== Priorities for 9.1 ==<br />
<br />
See [[PgCon_2010_Developer_Meeting#Development_Priorities_for_9.1|priority grid]] below.<br />
<br />
== Timeline for 9.1 ==<br />
<br />
Treat: Release in July, have an immediate commitfest of pending stuff. Will we release in July? If we're late, do we want to drop a commitfest and have shorter cycle? Maybe the development cycle should go, even if the release is delayed? Lane doesn't think we have enough manpower for that.<br />
<br />
Issue is that people are waiting 6 months to resume development. Are there enough reviewers, though? Maybe we could have a reviewfest. Haas thinks that we have manpower. Berkus likes the idea of a reviewfest, new people. Haas says that we could commit stuff or at least put it "pending commit". Smith says that pruning patches would be valuable. What's the main bottleneck of people? Maybe Kevin could run it.<br />
<br />
We would need to branch first. Which would involve backpatching. Branch on July 1, first CF on July 15? Or 15 and August 1? Tom Lane: if we're not close to releasable by July 1, then it's not feasible. Frost would like to have reviewfest in June. We could ask for reviews right now. RRRs need more direction. Selena will help.<br />
<br />
In the future, do we want to start earlier? We should get more people to help with getting to beta. Get people on open items list. Put it on the commitfest app? Magnus: but that makes it closer to a bug tracker. Haas: cycle of work is different for open items. No, will use wiki instead. Next year we'll have an open items app.<br />
<br />
Doing early branching will also help with bitrot. And will help with people's work schedules.<br />
<br />
Plan is to start 9.1 development on July 15, and only delay if things blow up.<br />
<br />
Alpha releases unanimously good. We might want to branch them differently. Downloads weren't huge, 10s or maybe 100 per alpha. But practice found issues with packaging, build scripts, etc. Maybe we shouldn't create branches for them, though. We should just tag it. We just wanted it to say the right name. This is probably fixed. So, for 9.1 we'll have a patch for the tarball and not a branch. Discussion of checkout/tag/branch details ensued.<br />
<br />
CFs need to have enough reviewers. Need to recruit more? Need to make it clear what's in it for reviewers. Reviewers should be nominated for minor contributors.<br />
<br />
Actions: <br />
* Selena to get reviewers to start now.<br />
* Branch on July 1, CF July 15.<br />
* Announce as plan for schedule.<br />
* No more branching for alphas.<br />
<br />
== Performance QA/Performance Farm ==<br />
<br />
Last year we took this as an issue at the meeting. Holdup was pgBench needed overhaul; results were useless on Linux. New pgBench should resolve those problems. Got something in pgBench tools which tries to figure out number of threads. Other thing which has been moving along well is benchfarm, and how should systems be set up to give reasonable performance. Greenplum has nice utility called gperftest, people need to test hardware before running pgbench. Nobody will let us benchmark high-end machines and talk about the results. Smith has some new high-end machines to test performance results.<br />
<br />
Smith: we are ready to write a spec for a performancefarm client. Need to build client for this. Frost has an intern to work on Postgres stuff, who will be working on the performance farm client. Will be working for 8 weeks.<br />
<br />
Performancefarm also needs to run a battery of individual operations for performance regressions. Also needs to run a quick hardware/OS test for comparability. Need a general framework; maybe we'll eventually add DW test or TPCH. <br />
<br />
Why do we keep the same dependency restrictions as buildfarm? It's easier to get clients that way. If we can tell people that they can just add the PerformanceFarm to the buildfarm, it's easier. Will go to assembled tool very soon. Data collection will start with 9.0 because of old pgbench. Biggest thing is to notice if someone's patch torpedoes performance.<br />
<br />
Propose that machines for the PerformanceFarm be named after plants. ;-)<br />
<br />
What about replication performance? Too big to take over.<br />
<br />
Actions: Stephen's intern to develop PerformanceFarm application. Will need help from Dunstan/Drake etc.<br />
<br />
== Advanced Security Features ==<br />
<br />
KaiGai's Presentation <link?><br />
<br />
We try to load something externally to make access control decisions. Row-level access controls have a number of issues. <br />
<br />
PostgreSQL currently has logic & access controls in the same place. (1) rework external check using same flow. (2) add label support. (3) add SELinux support. New method will have clear separation between Postgres and SELinux, possibly using a loadable module. <br />
<br />
Rework of access controls needs to do all of the access control checks at once instead of one at a time with query in between. Need to do one object at a time because otherwise it's too big. That way patches are only 200-500 lines each.<br />
<br />
Finally, add security labels to objects.<br />
<br />
The concern with the rework was that moving all of the security checks into a separate area meant that area needed to have knowledge about everything. Haas: need to provide a clean interface to security providers, but not by changing huge amounts of code. Heikki: it's not that big, it's fairly mechanical. <br />
<br />
Currently we check some basic things (like does the table exist) and later we check fine-grained permissions. Completely isolating it is not possible. Locks for one thing. Also it's difficult to have a clean API because the API needs to know about everything. Kaigai says that generalized interface isn't necessary, Linux has had to add to the API with each new security provider.<br />
<br />
Why can't we put calls in the current aclcheck? Too low-level, don't pass enough information to them. We could pass more. But if we have the OID, we can look up all of the class information. Right now we have duplicate permissions checks all over the code. And the checks we need to do are not necessarily the same checks which SE wants to do.<br />
<br />
Smith: what users want isn't necessarily what we have in the patch. Maybe we should just build a subset of functionality, a lot of people don't care about DDL etc. We could implement only SELECT security, it would make the patch more digestible. All permissions checks for SELECT are all in one place. Or DML only.<br />
<br />
Does the information provided supply enough? It has to be because it's the first stage. It's basically the information the user entered. <br />
<br />
Is DML-only enough? Will it leak? Of course. Anyway, it's a useful simplification. Smith says 95% of use cases are solved by that. Table discussion of all of the ACL checks. Kaigai says that checks are the same for DML and DDL, but others do not agree.<br />
<br />
Security Label discussion. Access control decisions operated by Subject, Target, Action. Label replaces Target. Syntax introduced, ALTER ... SET WITH SECURITY LABEL, SECURITY LABEL TO label. Simplified suggestion, just add seclabel[] text to catalogs. But wastes space and hurts performance. <br />
<br />
Should SELinux be in core or be loadable module? <br />
<br />
Actions: Kaigai, Stephen, Smith, etc. to get together at pgCon and hash out some more issues.<br />
<br />
== CVS to GIT ==<br />
<br />
It's probably clear that we should change to a new VCS, and it should be GIT. No disagreement. <br />
<br />
What are the gating factors to moving now? Let's make a decision to do it, and when and we'll fix the issues. Problem with buildfarm has been solved. Buildfarm now runs git. Building Git on any older platform is impossible; bad make files. Getting all buildfarm members running Git wouldn't be possible, but we can run CVS mirror for older ones.<br />
<br />
We have a checklist on the wiki already for switching.<br />
<br />
Most buildfarm members will run either; it's a config item. We'll track which ones are using the emulator. <br />
<br />
Building older versions may have issues to build identically. Magnus claims that it's been fixed. How much do we care about old tags? There are still a couple of bad files but they're minor. Do we still have old issues with CVS? Marc says they're fixed, shouldn't show up in Git history.<br />
<br />
Are commit e-mails an issue? No. But e-mails will look different. Tom wants them to just work the same. <br />
<br />
We don't need to solve technical issues here. Just pick a date. We'll know when we're doing it and that everything will suck for a month afterwards. Will need to be a low-stress time for the project, between commitfests. Tom isn't sure how to apply commits across multiple branches. Discussion of Git details hashed out.<br />
<br />
Two issues: sheer space usage. Second, management of commits. But these are not serious problems. Andrew has checklist. Will need to test stuff and decide how to do specific stuff. Suggestion on date: after 2nd commitfest. No, halfway after first commitfest. No, immediately after first commitfest ... August 20th or similar.<br />
<br />
We will have git super-master which synchs to git.postgresql.org. Can do receive hooks. Have we considered using github? Github should not be canonical source, in case they go away. Can't do postcommit hooks on Github. People can just do both. Forking Postgres repo puts you near their limit. Put off Github questions.<br />
<br />
Issue: what about the name? People will need to reclone, will be part of suckitude. Rename old repo and create new repo. Where will secret master repo be? Maybe Conova.<br />
<br />
Mapping usernames onto e-mail addresses could be a pain. Maybe we should standardize on committer@postgresql.org. Committers should pick names before conversion.<br />
<br />
Discussion about commit messages, merges, commits, etc. ensued.<br />
<br />
Action: <br />
* Magnus to set up emulator.<br />
* Andrew to publish checklist of how to set up your Git<br />
* Move to Git August 17-20: Magnus, Haas, Dunstan. Frost will be out.<br />
<br />
== Lunch ==<br />
<br />
== Clustering ==<br />
<br />
=== Snapshot Cloning ===<br />
<br />
Koichi: had a meeting in Tokyo and made a list of core APIs which clustering projects could use. Snapshot cloning is one such, plus it's useful for parallel query and parallel pg_dump. First use snapshot cloning to enforce consistent view of the database. Has already implemented this for PostgresXC. The same thing could be applicable for single PostgreSQL. It is a very simple implementation, and should not produce resource conflicts.<br />
<br />
For parallel pg_restore, maybe snapshot cloning will not be sufficient. Cloning the snapshot for read-only transactions is simple, not for write transactions. <br />
<br />
Smith: Using this for parallel query also works for read-only cloning.<br />
<br />
Very useful for dumping partitioned tables, with one backend for each partition. <br />
<br />
Added API to libpq. But shouldn't this be a server-side command? For cluster usage, it was useful for it to be in libpq. RH: one idea is a function we could call, and the shared snapshot would use a "cookie". Joachim W. wrote a patch with publish/subscribe. Needs to be all server-side. <br />
<br />
Tom has suggestion for simpler implementation, without locks. That is, you just need to have same snapshot start, not shared snapshot. Snapshot would die once the original transaction was gone. Koichi: this is not a problem.<br />
<br />
Tom: maybe we could just use a prepared transaction, which would keep the snapshot valid. Proposing to begin with read-only implementation.<br />
<br />
Action: Koichi to extract patch from PostgresXC and submit.<br />
<br />
=== XID Feed ===<br />
<br />
PostgresXC needs to have a transaction run on multiple servers in the same cluster. The XID is needed so that you can have the same transactions. Will also be useful for parallel write operation, but that's really complicated. Parallel backend needs to be assigned same XID, but locks, resource conflicts.<br />
<br />
Heikki: let's start with parallel read queries.<br />
<br />
JB: parallel write on one server is a different feature than XID feed for clustering.<br />
<br />
Multiple backends share the same XID so they can share the same snapshot. This applies if you're doing a multimaster update across multiple servers, so that you can maintain serialization. Stark explains multi-server deadlock situation.<br />
<br />
The XID is not the issue, it's the commit order. But communicating the xids means that you don't need to communicate more data to the servers. Just maintaining transaction IDs isn't enough, we need to maintain commit/abort info. If you want a snapshot which is valid on both nodes, you'd have to lock the procarray on both. You'd have to have a single global transaction manager controlling commits.<br />
<br />
What is the core feature here? You might want to make a specific instance of Postgres the global transaction manager. Or you might make one postgres a consumer of snapshots. Heikki: you could interrogate each node about what transactions were running at the time of the snapshot. Some discussion without agreement.<br />
<br />
Koichi explains how snapshots are distributed in PostgresXC, they receive them with XID. There's no negotiation between nodes. What stability would this affect with core Postgres? Vacuum and analyze need their own GXID. <br />
<br />
What is the feature: getting XID and Snapshot from PostgreSQL. Is this useful for core Postgres? Does it work for cluster systems other than PostgresXC? Would be useful for all synchronous multi-master replication. Like Postgres-R. Or any distributed databases. Should be done as a "hook". Not really different from two-phase commit, but not testable without an external manager, which is the main problem. How could you test it?<br />
<br />
What other things do you need? What other hooks would we need in core to support GTM and other clustering functionality? If we had SQL/MED working, you could export XIDs to remote tables. But we don't have that yet.<br />
<br />
A hook will be fine.<br />
<br />
Action: Koichi to come up with proposed patch design<br />
<br />
=== General Modification Queue ===<br />
<br />
Marko: one use case is transactional queue. Have some sample implementations in pgQ and Slony. Two different strategies: Slony/Londiste, and Josh wants to replicate data to external non-PostgreSQL tables. Josh is mainly concerned about write overhead, but no way around WAL.<br />
<br />
What is not solved by current LISTEN/NOTIFY? What this has is potential for really improving. Both Slony and pgQ rely on being able to filter out blocks of events and serialized sequence of individual events. Problem is eventID sequence number cannot be cached, that causes painful overhead. Both systems come up with insert/update/delete statements which go by index scan.<br />
<br />
If we can support general functions where a trigger can hand in old and new tuples and the receiver can get something which allows it to pull new data. Seems like commit order is the issue. Why do you need a sequence which can't be cached? If you knew what order they committed ... you wouldn't need a global counter. Jan isn't sure this makes sense for core because of lack of version independence.<br />
<br />
If you had a stream of commit information, you'd just have to buffer it. But that could work. The real missing piece is a commit ordering stamp, which the database should supply for you. This was a requirement of Postgres-R as well, they need to know what order to apply the writeset in.<br />
<br />
We could use the LSN of the commit record as that number. In the CLOG, for a range of XIDs, we have some LSNs. But it's not enough information. It sounds like all that's really needed is to have a way to grab LSN numbers. Maybe write it to a separate file.<br />
<br />
Commit-order table would need to be truncated. The clients have to send message about being done with it. Do we want to call gettimeofday while holding walinsertlock? Tom: we already are, but it's not exact enough. LSN plus approximate timestamp would give you order.<br />
<br />
Clients would need to look and say grab all transactions between one LSN and another.<br />
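<br />
The commit-order idea above can be illustrated with a rough sketch. This is only an illustration under stated assumptions, not anything proposed at the meeting: the <code>CommitLog</code> class, its method names, and the use of plain integers for LSNs are all hypothetical. It shows a table of (LSN, XID) pairs kept in commit order, a range query of the "grab all transactions between one LSN and another" kind, and truncation once clients report they are done.<br />
<br />
```python
import bisect
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Hypothetical in-memory commit-order table keyed by commit-record LSN."""

    lsns: list = field(default_factory=list)  # kept sorted ascending
    xids: list = field(default_factory=list)  # xids[i] committed at lsns[i]

    def record_commit(self, lsn: int, xid: int) -> None:
        # Commit records normally arrive in LSN order; insert defensively anyway.
        pos = bisect.bisect_right(self.lsns, lsn)
        self.lsns.insert(pos, lsn)
        self.xids.insert(pos, xid)

    def commits_between(self, after_lsn: int, upto_lsn: int) -> list:
        """Return (lsn, xid) pairs with after_lsn < lsn <= upto_lsn, in commit order."""
        lo = bisect.bisect_right(self.lsns, after_lsn)
        hi = bisect.bisect_right(self.lsns, upto_lsn)
        return list(zip(self.lsns[lo:hi], self.xids[lo:hi]))

    def truncate(self, done_lsn: int) -> None:
        """Drop entries at or below done_lsn once every client has reported done."""
        cut = bisect.bisect_right(self.lsns, done_lsn)
        del self.lsns[:cut]
        del self.xids[:cut]
```
<br />
Note that the order returned is commit order, not XID order: a transaction with a lower XID can commit at a higher LSN, which is why a cached XID-like sequence alone is not enough.<br />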
<br />
Action:<br />
* Develop specification for commit sequence / LSN data<br />
<br />
=== DDL Triggers ===<br />
<br />
Jan: it's a feature we've been missing for at least a decade. Jan starting to work on it, but DDL code is very messy. It's in tcop/utility.c, function ProcessUtility. The mess is that while the function gets a query string some calls don't put a real query string in there. <br />
<br />
Purposes include enforcement of complex CREATE requirements. Also replicating DDL to replicas. Wouldn't you like it better if the data were passed to the trigger using some kind of structured data rather than a query string? Take node structure of utility statement and create query string which can be passed, as well as passing node structure using nodetostring.<br />
<br />
We haven't exposed nodetostring because it's not a stable API. But generally changes there indicate changes in features. But if we could give the trigger an object name. Maybe we could pass before-and-after snapshots.<br />
<br />
How do Oracle and other systems get this data?<br />
<br />
Also there's an issue that some utility statements call other utility statements. <br />
<br />
Nodetostring exposure was also vetoed for other reasons. Slony and other systems can take a tree instead. Maybe we should already have a utility function. We already theoretically have a hook, but it's not being used. And also still a problem with recursive calls.<br />
<br />
pgAdmin wants a notification for changes. Would need some notification with data about object changed. But just object changed would be enough. We could also build up a set of events for DDL changes over time based on the tree ... we can start with just object type and object ID. And type of modification: create, drop, alter.<br />
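<br />
As a rough sketch of the minimal notification described above (object type, object ID, and kind of modification), the following shows one possible shape for such an event and a listener list called in registration order. All names here (<code>DDLAction</code>, <code>DDLEvent</code>, <code>DDLNotifier</code>) are hypothetical and exist only for illustration; nothing of this form was specified at the meeting.<br />
<br />
```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class DDLAction(Enum):
    CREATE = "create"
    DROP = "drop"
    ALTER = "alter"


@dataclass(frozen=True)
class DDLEvent:
    """Hypothetical minimal payload: what kind of object changed, and how."""

    action: DDLAction
    object_type: str  # e.g. "table" or "index"
    object_id: int    # the object's OID


class DDLNotifier:
    """Hypothetical dispatcher; listeners run in the order they subscribed."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[DDLEvent], None]] = []

    def subscribe(self, listener: Callable[[DDLEvent], None]) -> None:
        self._listeners.append(listener)

    def notify(self, event: DDLEvent) -> None:
        for listener in self._listeners:
            listener(event)
```
<br />
Keeping an explicit, ordered listener list is one way to avoid the ordering ambiguity raised about hooks, at the cost of passing far less information than a full parse tree.<br />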
<br />
ProcessUtilityHook is there, but the problem is how it's exposed to the user. Hooks aren't used consistently and you don't know who has set it in what modules. Several people are already using it. Ordering becomes a problem. People don't want to use the hook.<br />
<br />
We also want a userspace implementation. Like maybe a trigger.<br />
<br />
Action:<br />
* Wiki page to be updated with spec by Jan et al.<br />
<br />
== Status Report on Modules ==<br />
<br />
Slides from Dimitri<br />
<br />
Many issues and topics. Talking today about dump/reload support. If you dump and restore, you don't want to restore objects. We want to support any source language. We want to support custom GUCs and versions in extensions. We also want upgrading facilities.<br />
<br />
We are not going to talk about schema. We are not going to talk about source-level packaging, ACLs, PGAN or OS packaging and distribution. Example of extensions/foo/control. Should be in user/share. Control file will have name, version, custom_variable_classes.<br />
<br />
Then you can just do install extension foo, drop extension foo. pg_restore would call install extension foo and not its objects.<br />
<br />
Need dependencies on extensionID in pg_depend so we know what belongs to the extension.<br />
<br />
Used name = value because we already know how to parse them in control file. pg_dump will be easy, we will know how to exclude based on dependencies. uninstall.sql files will be replaced by this. <br />
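<br />
To make the "name = value" control file idea concrete, here is a small sketch of a parser for such a file. It is an assumption-laden illustration, not the format Dimitri presented: the <code>parse_control</code> function, the <code>#</code> comment syntax, the optional single quotes, and the sample contents are all hypothetical. Only the three keys (name, version, custom_variable_classes) come from the discussion above.<br />
<br />
```python
def parse_control(text: str) -> dict:
    """Parse 'name = value' lines into a dict (hypothetical control file format).

    Simplification: '#' always starts a comment, even inside quotes.
    """
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        if "=" not in line:
            raise ValueError(f"line {lineno}: expected 'name = value'")
        name, value = (part.strip() for part in line.split("=", 1))
        # Accept optionally single-quoted values.
        if len(value) >= 2 and value[0] == value[-1] == "'":
            value = value[1:-1]
        settings[name] = value
    return settings


# Illustrative contents only; the real file layout was still under discussion.
SAMPLE = """
# extensions/foo/control
name = foo
version = '1.0'
custom_variable_classes = 'foo'
"""
```
<br />
Calling <code>parse_control(SAMPLE)</code> yields a plain dictionary of the three settings, which is all a tool like pg_dump would need in order to record the extension by name and version rather than dumping its objects.<br />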
<br />
What do we do about user-modifiable tables which are associated with a module? This is similar to how debian deals with config files. Or we could allow install files to have items which aren't tracked as part of the module, and pg_restore would need to know about that.<br />
<br />
Should extensions have different versions or different names per version? The install script is just a sql file, you can add a DO script. Debian handles this by checking if the configuration is the default and replacing it, or failing over to the user afterwards. <br />
<br />
Need license information in the control file.<br />
<br />
We probably just need to punt configuration tables to version 2.<br />
<br />
Will this help pg_upgrade? Maybe. Right now you have to migrate shared libs yourself. Will fix some cases.<br />
<br />
Action:<br />
* Dimitri to do patch<br />
<br />
== SQL/MED ==<br />
<br />
Slides by Heikki.<br />
<br />
Heikki: in EDBAS, we already have foreign tables. CREATE FOREIGN TABLES syntax. EDB has libpq, Oracle and ODBC. Shows slide with join plan; currently materialize locally.<br />
<br />
Have to decide what plans you can push to the remote database. Not all remote sources can handle all structures, including functions, joins etc. Even between PostgreSQL ruleutils.c is different for each version.<br />
<br />
Proposed FDW planner API. Pass parse tree, needs to say what it can take. How do we not duplicate the entire planner in the API. EnterpriseDB has been working on this but company has not committed to contributing it.<br />
<br />
Jan mentioned that at least a wrapper has to take a scan. <br />
<br />
Itagaki: didn't know EDB was also working on SQL/MED. Code is probably completely different. There's four issues: any features it should consider about. Currently considering dbLink and COPY FROM. Maybe there should be other features, like GDQ. <br />
<br />
Josh: what about PL/proxy? SQL/MED should support the functionality of PL/proxy. <br />
<br />
What is the best design to support access methods for tables? Postgres AMs for indexes, we need AMs for FDWs. Update is a problem, may start with select and insert.<br />
<br />
How to push WHERE conditions to FDW? Itagaki's WIP code uses internal tree, requires C code in FDW to parse. Might be unstable for that reason. External server might not be SQL server, so passing SQL isn't that useful.<br />
<br />
FDW vs. SRF ... can we merge these somehow? Current implementation of functionscan is to materialize. We didn't do value-per-call because it was difficult with PL/pgSQL. So we just materialized it.<br />
<br />
Heikki suggested that we don't use the FDW API from the SQL committee because nobody uses it. Supplies function for reaching into planner but can't imagine how it would work. EnterpriseDB implements pipeline.<br />
<br />
Also, an issue is cost estimation. How do we know how much it would cost. Statistics on remote tables? We could store them. It would also be nice to have joins take place on the remote server, but that's version 2 or 3. <br />
<br />
Also what about indexes on the remote server. One implementation in Japan has CREATE FOREIGN INDEX. Creating definitions locally of remote schema are very useful. DB2 implemented this and had commands to define foreign objects.<br />
<br />
Maybe we don't need to be really smart about this in the first version. But people are asking for this. Pass whole query to remote server. Wouldn't work for joins, but would work for union query. Defining an index on a FDW doesn't make much sense since we don't know what the costs are like over there. We should just have a function in the API.<br />
<br />
How would we recognize that we want to do the join on the remote server. We just need costs from the FDW. But we could also keep them in pg_stats. FDW might be used to access complex database like Infobright or Bigtable. <br />
<br />
Would EDB contribute a patch? Or just the rights for Heikki to steal bits? Patch from EDBAS would not work. Does the EDB patch help Itagaki? EDB might get Itagaki access to the code.<br />
<br />
Action:<br />
* EDB to decide on opening code or not<br />
* Review Itagaki's git repo code: Heikki, Peter<br />
* Itagaki to keep working on API -- what about Peter?<br />
<br />
== pg_migrator ==<br />
<br />
Bruce: Just fixed a bug in pg_upgrade for XID wraparound. Added docs. Migrator tries to rebuild a plane in the air. Bruce feels like he has a swiss army knife with pg_upgrade. Issue with FrozenXIDs in template0 fixed. Still a work in progress, which is why it's in contrib, but bugs are fixable.<br />
<br />
Will still have issues with page format, binary format changes. Haven't had those for a while, may not need them, and a dump & restore every 4-5 years is an improvement. Bunches of people have used it to migrate. The tool makes sure that it doesn't corrupt your data, it goes back if it hits an error.<br />
<br />
Stark: In the past we've had a chicken-and-egg issue if people wanted to make changes to the data format, we'd need a conversion function. And there's no hooks in pg_migrator. Could slow down the pace of adding features. We should add the functionality to do the conversion now.<br />
<br />
pg_upgrade from EDB has hook for COPY mode for conversion. And then what's the point of pg_migrator? If you don't have to rebuild indexes. But a lot of times you'd have to. But to have it in place so that it's not a hurdle for a data change. If we're going to do that we need to do it for 9.1 early in the cycle.<br />
<br />
Dump and reload is impossible for Smith's customers. page checksum thing is a good example. Read-on-conversion is a big performance hit. Background process to convert all old data. <br />
<br />
Also issue around internal representation of data type change. We either have to convert it, or have versioned data types.<br />
<br />
Is binary data conversion really faster? Read back one version, and convert while running. Something which can read one version level back and can convert is the only viable way. Would have a daemon to convert data. What about split pages, bigger page headers?<br />
<br />
CRC would be a good test for this. It's a small patch but has a lot of common upgrade issues. And it would be only one patch so we could put it off for another version if we had to. Would like to get this started for 9.1. CRC requires dealing with the split pages issue.<br />
<br />
Really tricky stuff includes changing indexing structure. Would need concurrent rebuild for unique indexes and primary keys. REINDEX concurrently was a deadlock concern. <br />
<br />
What about writing? Convert-on-read. <br />
<br />
Action:<br />
* Document what the plan is to do a conversion upgrade (Greg Smith)<br />
* Copy Zdenek's code (Greg Smith)<br />
<br />
== Other Business ==<br />
<br />
Treat: Mammoth Replicator into Core? Code has been BSD-licensed. Has features which don't exist in Hot Standby. Code is pretty big. Alvaro would say that it's not in great shape to contribute at this point.<br />
<br />
Has log-based replication including per-table. Has its own logs, and has binary vs. SQL replay. Not trigger-based. <br />
<br />
Real question is would we consider putting any replication solution into Core? Not until we've finished digesting HS/SR. Replication is one word for a dozen different solutions for 3 dozen problems. We've accepted one which solves some problems. Page thinks we should consider each case on its merits.<br />
<br />
Question is what parts of Mammoth make sense to be in core. Bruce thinks that mammoth is so tied into the backend that we couldn't accept it. It's too complicated. It doesn't sound like Mammoth offers enough functionality to make it worth it.<br />
<br />
Does mammoth need to be in core? It has grammar changes. We now have better ideas of what replication requirements are, we may have more commonalities in core in the future. Part of the reason we have so many is that we don't have one in Core.<br />
<br />
We'd consider more replication in core, but maybe not Mammoth.<br />
<br />
Action:<br />
* None<br />
<br />
== Other Business ==<br />
<br />
Peter says we're almost compliant with SQL 2008. Is it of PR value to comply with the remaining random stuff? People don't think so. <br />
<br />
What about Case Folding? We probably don't want to fix that. Wasn't on Peter's list, which is just features. Case Folding would break a lot of stuff. Thought Peter was already doing that with a couple of features per version.<br />
<br />
Compliance is only of moderate value. Is there a point of implementing the features on Peter's list?<br />
<br />
Summer of Code: how is it going?<br />
<br />
Haas has concern. His student is working on Materialized Views. Has proposed a very basic implementation. Might not be able to do even that. We can fail people.<br />
<br />
Selena just updated the open items on the mailing list. A lot of items were not closed on the mailing list.<br />
<br />
What about max_standby_delay? Tom wants to go back to just a boolean. What about Tom's original proposal? Can't really do it. Tom wants to remove time dependency. What are the issues with max_standby_delay? Idle time on the master uses up time on the standby. Plus NTP and keepalives and some other stuff.<br />
<br />
Major discussion on max_standby_delay ensued.<br />
<br />
== Development Priorities for 9.1 ==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Feature<br />
!Developers<br />
!Notes<br />
|-<br />
|MERGE/UPSERT/REPLACE<br />
|GSoC with Greg Smith/Simon<br />
|Issues with predicate locking<br />
|-<br />
|Synchronous replication<br />
|Fujii/Zoltan/Simon<br />
|Review by Heikki<br />
|-<br />
|Improve Hot Standby/Streaming Rep usability<br />
|Simon/Fujii/Greg Smith<br />
|Review by Josh Berkus<br />
|-<br />
|Snapshot cloning API<br />
|Koichi<br />
|Sample app is parallel pg_dump<br />
|-<br />
|Locale/encoding <br />
|<br />
|per column/per operator collation<br />
|-<br />
|User exposed predicate locking<br />
|Simon<br />
|Interaction with serialization<br />
|-<br />
|[[Serializable]]<br />
|Kevin Grittner/Dan Ports<br />
|<br />
|-<br />
|pg_upgrade in core<br />
|Bruce<br />
|<br />
|-<br />
|External security provider<br />
|KaiGai<br />
|<br />
|-<br />
|Row-level security<br />
|KaiGai<br />
|<br />
|-<br />
|Writeable CTEs<br />
|Marko Tiikkaja <br />
|<br />
|-<br />
|SQL/MED<br />
|Itagaki<br />
|<br />
|-<br />
|Generalized inner-indexscan plans<br />
|Tom Lane<br />
|<br />
|-<br />
|Re(?)plan parameterized plans with actual parameter values<br />
|Tom Lane<br />
|<br />
|-<br />
|COPY as a FROM clause<br />
|Andrew Dunstan<br />
|<br />
|-<br />
|Pipelining/value per call for SRFs<br />
|Joe Conway<br />
|<br />
|-<br />
|Partitioning implementation<br />
|Itagaki<br />
|<br />
|-<br />
|Index only scans<br />
|Heikki<br />
|<br />
|-<br />
|Global temp/unlogged tables<br />
|Robert Haas<br />
|<br />
|-<br />
|Inner join removal<br />
|Robert Haas<br />
|<br />
|-<br />
|Extensions<br />
|Dimitri<br />
|<br />
|-<br />
|Range types<br />
|Jeff Davis<br />
|<br />
|-<br />
|Materialized views<br />
|GSoC+Robert Haas<br />
|<br />
|-<br />
|JSON data type<br />
|GSoC<br />
|<br />
|-<br />
|DDL Triggers<br />
|Jan<br />
|<br />
|-<br />
|Leaky view security<br />
|<br />
|<br />
|-<br />
|KNNGist<br />
|Teodor<br />
|<br />
|-<br />
|Performance farm<br />
|Greg Smith<br />
|<br />
|-<br />
|Git<br />
|<br />
|<br />
|-<br />
|[[PGXN]]<br />
|David Wheeler<br />
|<br />
|}<br />
<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2011_Developer_Meeting&diff=37566PgCon 2011 Developer Meeting2023-02-10T08:43:18Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers and senior figures from PostgreSQL-developer-sponsoring companies is being planned for Wednesday 18th May, 2011 near the University of Ottawa, prior to pgCon 2011. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
This is a PostgreSQL Community event. Room and refreshments/food sponsored by EnterpriseDB. Other companies sponsored attendance for their developers.<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be from 9AM to 5PM, and will be in the "Albion B" room at:<br />
<br />
Novotel Ottawa<br />
33 Nicholas Street<br />
Ottawa<br />
Ontario<br />
K1N 9M7<br />
<br />
Food and drink will be provided throughout the day, including breakfast from 8AM.<br />
<br />
[http://maps.google.ca/maps?f=q&source=s_q&hl=en&geocode=&q=novotel+ottawa&aq=&sll=49.891235,-97.15369&sspn=36.237851,79.013672&ie=UTF8&hq=novotel+ottawa&hnear=&ll=45.421528,-75.683699&spn=0.036869,0.077162&z=14&iwloc=A&layer=c&cbll=45.425741,-75.689638&panoid=Z4FUGnkZkdHAOkIxyjjS9Q&cbp=12,25.83,,0,-0.6 View on Google Maps]<br />
<br />
== Attendees ==<br />
<br />
The following people have RSVPed to the meeting:<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Jeff Davis<br />
* Selena Deckelmann<br />
* Andrew Dunstan<br />
* David Fetter<br />
* Marc Fournier<br />
* Dimitri Fontaine<br />
* Stephen Frost<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Alvaro Herrera<br />
* Tatsuo Ishii<br />
* Marko Kreen<br />
* KaiGai Kohei<br />
* Tom Lane<br />
* Heikki Linnakangas<br />
* Fujii Masao<br />
* Bruce Momjian<br />
* Dave Page<br />
* Simon Riggs<br />
* Teodor Sigaev<br />
* Greg Smith<br />
* Greg Stark<br />
* Koichi Suzuki<br />
* Robert Treat<br />
* David Wheeler<br />
* Mark Wong<br />
<br />
== Proposed Agenda Items ==<br />
<br />
Please list proposed agenda items here:<br />
<br />
* Review of the move from CVS to GIT (Dave Page)<br />
* Build and Test Automation (David Fetter)<br />
* SELinux/PG Update and what-about-RLS? (Stephen Frost, KaiGai Kohei)<br />
* What to do about MaxAllocSize? (Stephen Frost)<br />
* Improving logging (Stephen Frost, David Fetter)<br />
* Changes to the Wire Protocol (David Fetter)<br />
* Slave-only based backups (Robert Treat)<br />
* Authorization issues (Alvaro Herrera)<br />
* Resource control (Simon Riggs)<br />
* User Defined Daemons (Simon Riggs)<br />
* Streaming SRFs and FDW WHERE clauses (Simon Riggs)<br />
* Schedule for 9.2 Development (Robert Haas)<br />
* Database Federation support (Koichi Suzuki)<br />
* Report from prior day's Clustering Summit (Josh Berkus)<br />
* Managing release schedules and patch submission processes (Dave Page)<br />
<br />
== Agenda ==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|08:00<br />
|Breakfast<br />
|<br />
|-<br />
|08:45 - 09:00<br />
|Welcome and introductions<br />
|Dave Page<br />
|-<br />
|09:00 - 09:15<br />
|Review of the move from CVS to GIT<br />
|Dave Page<br />
|-<br />
|09:15 - 09:45<br />
|Build and test automation<br />
|David Fetter<br />
|-<br />
|09:45 - 10:15<br />
|SELinux/PG Update, and what to do about RLS? [http://sepgsql.googlecode.com/files/pgcon2010-developer-meeting-kaigai-handsout.pdf (handout)]<br />
|Stephen Frost, KaiGai Kohei<br />
|-<br />
|10:15 - 10:30<br />
|What to do about MaxAllocSize?<br />
|Stephen Frost<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 10:45<br />
|Coffee break<br />
|<br />
|-<br />
|10:45 - 11:00<br />
|Schedule for 9.2 development<br />
|Robert Haas<br />
|-<br />
|11:00 - 11:20<br />
|Report from prior day's Clustering Summit<br />
|Josh Berkus<br />
|-<br />
|11:20 - 11:40<br />
|Improving logging<br />
|Stephen Frost, David Fetter<br />
|-<br />
|11:40 - 12:00<br />
|Slave-only based backups<br />
|Robert Treat<br />
|-<br />
|12:00 - 12:30<br />
|Resource control<br />
|Simon Riggs<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch <br />
|<br />
|-<br />
|13:30 - 14:00<br />
|Authorization issues<br />
|Alvaro Herrera<br />
|-<br />
|14:00 - 14:30<br />
|DBT-2 I/O performance<br />
|Koichi Suzuki<br />
|-<br />
|14:30 - 15:00<br />
|Streaming SRFs and FDW WHERE clauses<br />
|Simon Riggs<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:15<br />
|Tea break<br />
|<br />
|-<br />
|15:15 - 15:45<br />
|Database federation support [http://sourceforge.net/apps/mediawiki/postgres-xc/index.php?title=Postgres-XC_code_contribution_to_PostgreSQL (related page)]<br />
|Koichi Suzuki<br />
|-<br />
|15:45 - 16:15<br />
|User defined daemons<br />
|Simon Riggs<br />
|-<br />
|16:15 - 16:45<br />
|Managing release schedules and patch submission processes<br />
|Dave Page<br />
|-<br />
|16:45 - 17:00<br />
|Any other business/group photo<br />
|Dave Page<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
| <br />
|}<br />
<br />
<br />
==Minutes==<br />
<br />
===Attendees===<br />
<br />
Dave Page, Alvaro Herrera, Kevin Grittner, Greg Smith, Dimitri Fontaine, Greg Stark, Bruce Momjian, Robert Treat, Heikki L., Mark Wong, Marko Kreen, Magnus Hagander, Selena Deckelmann, Robert Haas, Tom Lane, KaiGai Kohei, Koichi Suzuki, Fujii Masao, Tatsuo Ishii, Oleg, Teodor, David Fetter, Simon Riggs, Stephen Frost, Jeff Davis, Andrew Dunstan, David Wheeler<br />
<br />
===Move To Git Recap===<br />
<br />
Robert, Stephen: things are good, nothing necessary. Andrew thinks that it would be nice to be able to pull a big patch and build. Fetter wants buildfarm to subscribe to alternate repos. Maybe we should bring back the Hudson server. Buildfarm isn't right for this. <br />
<br />
Real issue with Hudson was maintaining it. Nobody was using it or cared about it.<br />
<br />
Some projects haven't switched. pg-translation hasn't switched yet; being worked on now. Do we need to get other projects to switch? JDBC. CVS is a bottleneck. pg-translation needs some work, it's documented but nobody does it but Peter. Should we merge it with the main repo?<br />
<br />
Should we be mailing patches around? Haas finds it less work for him. Smith agrees. With branches it's hard to figure out the diff. Could use GitHub's compare view. For now, we want people to submit the link to the patch and the GitHub repo. Maybe add something to the commitfest app with a link to the repo?<br />
<br />
The real problem is that the archives mangle patches. Need to find out why mhonarc mangles patches. Inlining of patch text mangles them.<br />
<br />
===Build and Test Automation===<br />
<br />
What problem are we trying to solve? Performance regressions, for one. Also, proving performance gains from specific patches. <br />
<br />
Haas doesn't think we can find regressions. But we've definitely had cases. There are tools which automate specific query regressions. Josh can work on a tool for regression tests.<br />
<br />
Regression tests are a separate case from performance progress. pgbench doesn't test the things we care about very much. Greg Smith has built the first incarnation of a performance test farm. Has measured in-memory select-only performance. Greg gets 56K selects/second on an in-memory database for key lookups. Frost and Smith hacked up the buildfarm structure to do performance tests. Once they have the code, people will provide hardware.<br />
<br />
Good progress on server-side code, but need to integrate with the buildfarm server for performance test results. Client part is done, but need server part. Andrew can put some time into integrating the server. Need to specify data structures. The webapp side isn't done.<br />
<br />
Bruce wants something which allows testing configuration changes and similar. Andrew has just completed a feature for the buildfarm which allows adding new modules, like running other things, pulling from repos, building drivers.<br />
<br />
What about cloud servers? Too much variability? In the future, we could measure things like CPU ticks to measure efficiency. But not now. Clouds could be used to test different memory sizes.<br />
<br />
===SE-Linux/PG===<br />
<br />
Kaigai prepared a handout [link]. Last year he tried to add many things; today will be more focused. Kaigai wants to share information from the SE-Linux community, and talk about row-level security.<br />
<br />
In the 9.2 development cycle, Kaigai wants to implement a userspace cache for authorization. The cache needs to be validated.<br />
<br />
The named type_transition rule would assign default labels to new objects, such as temporary tables. Doesn't require any core code according to Stephen & Haas.<br />
<br />
Leaky views and row-level-security (RLS). This is an old problem. Functions can be used to find things which the user isn't entitled to. This isn't just user-defined functions, but all functions. For example, division-by-zero errors can be used to expose numeric values. Or casting values to the wrong data types. So invisible columns aren't so invisible.<br />
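<br />
The division-by-zero leak can be sketched as follows. This is a minimal simulation in Python rather than SQL, so that it is self-contained; the table contents, the ownership rule, and the <code>probe</code> helper are all hypothetical and are not taken from the meeting. It only illustrates that a user-supplied qual evaluated before the view's own filter can reveal hidden values through errors.<br />

```python
# Hypothetical base table: each row has an owner and a secret salary.
ROWS = [
    {"owner": "alice", "salary": 100},
    {"owner": "bob", "salary": 250},
]


def security_filter(row, user):
    # The view's own qual: a user may only see their own rows.
    return row["owner"] == user


def scan(user, user_qual, pushdown):
    """Evaluate a query against the simulated view.

    With pushdown=True the user-supplied qual runs on every base row
    before the security filter, which is what makes the view leaky.
    """
    result = []
    for row in ROWS:
        if pushdown:
            visible = user_qual(row) and security_filter(row, user)
        else:
            visible = security_filter(row, user) and user_qual(row)
        if visible:
            result.append(row)
    return result


def probe(guess):
    # Raises ZeroDivisionError exactly when a scanned row's salary equals guess.
    return lambda row: 1 / (row["salary"] - guess) != 0


def leaks(user, guess, pushdown):
    # True if the error side channel confirms that some scanned row has this salary.
    try:
        scan(user, probe(guess), pushdown)
        return False
    except ZeroDivisionError:
        return True
```

In this sketch <code>leaks("alice", 250, pushdown=True)</code> confirms Bob's salary to Alice, while the same probe with <code>pushdown=False</code> never touches Bob's row. The cost of the safe ordering is that no user qual can be used to narrow the base scan, which is the performance concern raised in the discussion.<br />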
<br />
Need to discriminate what are problematic and not problematic scenarios. For example, if you prevent pushing down quals to relations, it improves security, but you have to decide what to restrict. Heikki proposed that indexable operators get pushed down, and not anything else. We have 2000 system-defined functions.<br />
<br />
Tom thinks it's not worth trying to fix UDFs, since there are too many ways to circumvent. What about contrib modules? We could look at indexes on that specific table, those are the only ones we care about. We're trying to get a view which isn't leaky and has reasonable performance. So we need some push-down. <br />
<br />
We need infrastructure in the core to decide which push-downs are OK. We need the checking function to be applied before the qual functions get applied. Checking if there is an index which could use the operator wouldn't be workable according to Tom. We could check if the user has permissions on everything underlying the view, but this is not the use case we care about.<br />
<br />
Also, none of this gets us tagged RLS instead of views. For that matter, it would be good to have RLS which wasn't dependent on SE-Postgres. Haas suggests imposing mandatory filtering conditions per user. It could be using the same framework as security views. There's a big use-case for virtual private databases.<br />
<br />
Haas warns that this can be complicated for managing security for large numbers of objects. Predicate-based RLS can be used to implement label-based. Kaigai says that they can be reconciled. <br />
<br />
Could we use triggers to apply labels to rows? Kaigai says yes. Need to beware of multiple before-insert triggers.<br />
<br />
Stark suggests that the user could declare which quals can be pushed down on a per-view basis for security views. This is rejected as unworkable. Kaigai originally proposed suppressing error messages. This only eliminates one side channel though, not all of them. Also, suppressing error messages is a bad idea.<br />
<br />
There was a lot of further discussion about possible approaches to prevent side channels.<br />
<br />
Kaigai suggests that it's not worth pursuing covert channel suppression or preventing probing. Heikki gave the example of a user/password table and they explored this a bit. It was suggested that rows you can't see be nullified. But this doesn't solve the qual push-down issue.<br />
<br />
Stephen suggested that we can push WHERE clauses down into set-returning functions. That would be useful anyway. This is like the FDW API, but it's different. Maybe we could just have FDWs to local tables. But it doesn't actually help.<br />
<br />
Kaigai summarized. We don't have a solution for leaky views, though.<br />
<br />
===MaxAllocSize===<br />
<br />
Issue is that we'd like to be able to allocate more than 1GB for some things. Hashtables, sorts, maintenance memory. Hash aggregates don't spill to disk. Stephen has a solution for this with doing additional palloc requests. He'll work on this soon.<br />
<br />
Currently palloc is fairly inefficient for vacuum; we palloc based on table size, so often we overallocate a lot. Maybe we could make multiple palloc calls, but that would increase overhead. This will be a bigger issue if users can allocate 8GB to vacuum.<br />
<br />
===9.2 Schedule===<br />
<br />
Discussion on pgsql-hackers didn't reach any conclusions. Could we make a decision about what the schedule would be here?<br />
<br />
Committers don't want to change the format of the CFs. When should the first CF start though? We could do the first one earlier than July 15 this year. But will that pull people off getting the release out? People are already working on 9.2 features anyway. Shooting for a slightly earlier branch/initial 9.2 CommitFest in June helps some with patch developer bit-rot, and may let developers who are focused on new features be productive for more of the year.<br />
<br />
If we want people to work on the 9.1 beta, we have to give them specific things to do. Most people don't know what to do for 9.1 now. And the list of open items hasn't been addressed.<br />
<br />
Part of the issue is that we don't have any formal structure to the beta process. We'd have a lot more to do then. <br />
<br />
Last CF of the release (January 15) is tough to reschedule usefully due to concerns about December/beginning of the year holidays.<br />
<br />
Work in August is particularly difficult to line up with common summer schedules around the world. Having the other >1 month gap in the schedule go there makes sense.<br />
<br />
Should we do more than four? Can't make that work. Hard to adopt without more active volunteers working on review (both at the initial and committer level) and an increase in available CF manager time. Should we reject large patches submitted for the last CF? Discussion of that later.<br />
<br />
The first CF goes very quickly, so we don't need to optimize for that. So the new schedule is:<br />
* June 15<br />
* September 15<br />
* November 15<br />
* January 15<br />
<br />
Need to publicize it this year, send to announce etc. Greg, Selena to update web pages.<br />
<br />
===Cluster Meeting Summary===<br />
<br />
See notes for [[PgCon2011CanadaClusterSummit|cluster meeting]].<br />
<br />
Addition: For parser export, it was suggested that the lexer is enough for a lot of cases. We just need to take the parts of the psql lexer and bundle it as a library. Or we could generalize the ECPG hack for scanning the grammar to support what pgPool needs.<br />
<br />
DDL triggers would also be useful for SE-Postgres.<br />
<br />
===Improving Logging===<br />
<br />
MySQL has the ability to log stuff to different files rather than everything going to one big file, which is nice. Stephen proposes making our log tag-based, with lines sent to specific files based on filtering. We need to decide on a set of tags, and put multiple tags on each log line.<br />
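<br />
A minimal sketch of tag-based routing, in Python. The <code>TagRouter</code> class, the tag names, and the in-memory sinks are hypothetical illustrations, not an existing PostgreSQL interface; the sketch assumes that a line matching no route falls back to a default log.<br />

```python
import io


class TagRouter:
    """Route each log line to every sink whose tag set overlaps the line's tags."""

    def __init__(self, default_sink):
        self.default_sink = default_sink
        self.routes = []  # list of (frozenset of tags, sink)

    def add_route(self, tags, sink):
        self.routes.append((frozenset(tags), sink))

    def log(self, message, tags):
        tags = set(tags)
        matched = False
        for wanted, sink in self.routes:
            if wanted & tags:
                sink.write(message + "\n")
                matched = True
        if not matched:
            # Lines that no route claims fall back to the default log.
            self.default_sink.write(message + "\n")


# In-memory sinks stand in for separate log files.
default_log = io.StringIO()
slow_log = io.StringIO()
auth_log = io.StringIO()

router = TagRouter(default_log)
router.add_route({"slow_query"}, slow_log)
router.add_route({"auth", "connection"}, auth_log)

router.log("duration: 5123 ms  statement: SELECT ...", {"slow_query"})
router.log("connection authorized: user=app", {"auth", "connection"})
router.log("checkpoint starting", {"checkpoint"})
```

Carrying several tags per line lets one message reach more than one destination, while the fallback means existing messages can be tagged gradually.<br />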
<br />
Magnus mentioned that he proposed something similar which got blocked because of STDERR messages. We'd need to not send everything through stderr, then. The logging collector would need to accept data structures. Would also support third-party filters.<br />
<br />
One of the problems with this is that it makes the log_collector more complex, and thus less reliable. We could have a default log, though, and thus only add or fix error messages a little at a time.<br />
<br />
Marko wants to send the log directly to a network syslog instead of a local syslog. <br />
<br />
Greg Smith thinks this is going down the wrong path, we're just making a bad system more complicated. He'd rather that everything go to a table. Or to pipes. Magnus has something working for logging to pipes.<br />
<br />
Tags are still useful, but where it goes is a separate question. Several people think that sending stuff to different log files is not that much of a problem. We could just split stuff into a default log and a tagged log. <br />
<br />
What Josh really wants is a table. There are issues with that.<br />
<br />
If we log to a pipe, then people can do what they want with the output. People will use the CSV format. Do we want to make the csv format configurable? Josh and Dave Page think it's not that useful to make it configurable. This is the lowest priority. The text of the query is the dominating factor.<br />
<br />
===Slave-only Base Backups===<br />
<br />
Pre-9.0 we could make backups based only on a slave for PITR. But we can't do that for streaming replication. The issue is that the marker for the ending location doesn't get sent over the stream. <br />
<br />
Treat wants to be able to take backups on slave machines without touching the master at all. We can't do this because we can't run pg_stop_backup on the slave. Heikki thinks this ought to work now. But this doesn't work when you want to promote the standby to a standalone.<br />
<br />
Treat will post his testing information to hackers. The real thing is to support pg_basebackup off the standby, but that requires cascading replication. Heikki and Treat discussed this problem for a while. The issue seems to be the backup labels. They will follow up on this.<br />
<br />
===Resource Control===<br />
<br />
How do we control multiple queries executing in the same environment? We have issues with I/O and memory. One issue is that because work_mem is locally settable we can overallocate. Plus it's hard to count work memory. For disk I/O, it's common to want to run large queries in a slow mode so that a big query doesn't have an impact on shorter, more important queries. <br />
<br />
The way we solved that in Greenplum was resource queues. There are other possible implementations though. Global resource pools are how you did it in the old days. <br />
<br />
Jeff comments that our operators don't obey work mem even locally.<br />
<br />
Josh discussed that admissions control for queries at estimate time would actually work, or more that it would actually improve things. The issue is that we'd have to replan the query, which would be costly.<br />
<br />
Kevin suggested that we would queue queries rather than replanning them. This seemed generally a good idea, much better than replanning.<br />
<br />
We also don't track the amount of memory used, but we could do that. Drawing from the pool at estimate time appeals. <br />
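<br />
The idea of queueing queries against a shared memory pool at estimate time can be sketched as below. This is a hypothetical Python model, not PostgreSQL code: <code>MemoryPool</code>, its method names, and the strict FIFO policy are assumptions made for illustration.<br />

```python
from collections import deque


class MemoryPool:
    """Admit queries while their estimated memory fits; queue the rest."""

    def __init__(self, total_kb):
        self.free_kb = total_kb
        self.running = {}       # query id -> reserved kB
        self.waiting = deque()  # (query id, estimated kB), FIFO order

    def submit(self, query_id, estimate_kb):
        # Admit immediately only if nothing is queued ahead and the estimate fits.
        if not self.waiting and estimate_kb <= self.free_kb:
            self._start(query_id, estimate_kb)
            return "running"
        self.waiting.append((query_id, estimate_kb))
        return "queued"

    def finish(self, query_id):
        # Return the reservation, then admit queued queries while the head fits.
        self.free_kb += self.running.pop(query_id)
        while self.waiting and self.waiting[0][1] <= self.free_kb:
            self._start(*self.waiting.popleft())

    def _start(self, query_id, estimate_kb):
        self.free_kb -= estimate_kb
        self.running[query_id] = estimate_kb
```

Because the plan is kept and the query simply waits, nothing needs to be replanned. A real design would also have to handle estimates larger than the whole pool, which would wait forever in this sketch.<br />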
<br />
One thing to minimize effects on disk-io is to do "query_delay" like we do vacuum_delay. Also for DDL operations, which can take a really long time. Greg Smith tried to build this once. The problem is how do we accumulate costs? The stuff in vacuum is pretty buried and nonportable.<br />
<br />
The main point is to get resource control on the agenda so that the idea doesn't get kicked off pgsql-hackers.<br />
<br />
Simon's experience with priorities from Teradata (low, medium, high) is that it's a terrible model which doesn't work in practice. <br />
<br />
I/O and WAL traffic are resources we need to control. Replication delay is very spiky based on what's happening on the master, which is a problem due to data loss. Need to discuss on lists.<br />
<br />
DB2 allocates work_mem out of a shared pool. This is relevant to parallel query, because it would require shared memory for sorts.<br />
<br />
David Fetter wants to look at sort algorithms for SSD or ramdisk. Discussion about algorithms ensued.<br />
<br />
===Lunch===<br />
<br />
Sandwiches, salad, cake.<br />
<br />
===Authorization Issues===<br />
<br />
Are we able to drop privileges at appropriate times? One thing is that the SQL standard does not have RESET ROLE, so they don't cover this problem. SET LOCAL is limited to the current transaction but not to subtransactions. Security definer functions which call ordinary functions after a SET ROLE don't work correctly ... they can RESET to the higher ROLE.<br />
<br />
One possibility is to have an actual stack of ROLEs. We could extend what we did for autovacuum.<br />
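<br />
A role stack of this kind can be sketched as follows. This is a hypothetical Python model of the idea, not PostgreSQL's actual behaviour: the <code>Session</code> class and its frame layout are assumptions. Each function call pushes a frame, and RESET ROLE only returns to the role that was in effect when that frame was entered.<br />

```python
class Session:
    """Session whose RESET ROLE cannot escape the current function frame."""

    def __init__(self, login_role):
        # Each frame is [floor_role, current_role].
        self.frames = [[login_role, login_role]]

    @property
    def current_role(self):
        return self.frames[-1][1]

    def set_role(self, role):
        self.frames[-1][1] = role

    def reset_role(self):
        # Reset only to the role that was in effect when this frame began.
        self.frames[-1][1] = self.frames[-1][0]

    def enter_function(self):
        # A called function inherits the caller's current role as its floor.
        role = self.current_role
        self.frames.append([role, role])

    def leave_function(self):
        self.frames.pop()
```

With this model, a privileged function that lowers its role and then calls an ordinary function is safe: a reset inside the callee stays at the lowered role, and only the original frame can reset back to the higher one.<br />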
<br />
Alvaro suggested removing RESET ROLE. Or disable it within Security Definer. You'd need a stack for the current session. <br />
<br />
This is definitely an issue. Haas suggested making RESET ROLE a protocol-level call. Marko objected that you don't want to be invalidating the cache every time you change ROLE.<br />
<br />
===DBT-2 I/O Performance===<br />
<br />
A comparison of DBT-2 with PostgreSQL against certain other databases. Our performance is superior on many real-world cases, but the other database really outstrips us on DBT-2. One critical issue is the amount of IO we do.<br />
<br />
Particularly, the amount of WAL writing we do is almost three times as much. Full_page_writes is one big cause of the additional logging. Turning off full_page_writes (FPW) decreased WAL logging by 70% and increased throughput by 25%.<br />
<br />
We also do a lot more writing of the database files, and checkpoint scheduling improvements might help that.<br />
<br />
The DBT2 benchmark includes a table which is too big to be cached, so it's dominated by I/O performance. Since the table isn't cached, full_page_writes are more frequent. TPCC spec only checkpoints once per benchmark run.<br />
<br />
Folks in the NTT group are worried about the I/O performance, and are nervous about using PostgreSQL in I/O-bound workloads.<br />
<br />
Have we considered compressing full_page_writes? No, not yet. We could test it with no WAL logging at all.<br />
<br />
One, we can make full_page_writes configurable per table. This doesn't work if we don't have a recovery strategy. One way to solve this is checksums. Also a problem is that we don't detect the corruption immediately, we'll have the corruption and not encounter it for a long time.<br />
<br />
Heikki suggested making it always safe to replay the log multiple times without FPW. This might bloat the logs just as much as FPW, though. The big issue is that FPW are occurring more often than we expect them to, and we should figure out why.<br />
<br />
Koichi wants to solve write order for checkpoints. This speeds up recovery by 5X. <br />
<br />
Stark suggests that the main issue is that our database is just larger, and that's the source of a lot of I/Os. We don't have the data for this.<br />
<br />
One thing we do is that when we log the WAL for a row, we log the whole row. We also don't have compressed indexes. <br />
<br />
DBT-2 is completely I/O bound, so you can sacrifice CPU to improve I/O, like by doing file compression.<br />
<br />
Our row header is 24 bytes. This isn't a good area of performance.<br />
<br />
Archive logs are uncompressed. pg_lesslog + compression can shrink the archive logs by 85%. pg_clear_xlog_tail is safer. <br />
<br />
Haas suggested writing only the page header and pointers instead of the full page. <br />
<br />
We can't assume that the OS is passing full 8Ks to the storage, which is why you can't turn off FPW even if you have BBWC. Heikki's suggestion would fix this, though. InnoDB does "double-writes", where full pages get written to a separate file. <br />
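<br />
The torn-page problem behind full page writes can be illustrated with a small Python sketch. The 4 kB OS block size, the CRC-32 checksum, and the helper names are assumptions chosen for the example; they are not claims about PostgreSQL's on-disk format.<br />

```python
import zlib

PAGE_SIZE = 8192  # database page size discussed above
OS_BLOCK = 4096   # assumed size of the unit the OS hands to storage


def torn_write(old_page, new_page, blocks_written):
    """Return the on-disk page if a crash hits after `blocks_written` OS blocks."""
    cut = blocks_written * OS_BLOCK
    return new_page[:cut] + old_page[cut:]


old_page = bytes([1]) * PAGE_SIZE
new_page = bytes([2]) * PAGE_SIZE
new_checksum = zlib.crc32(new_page)

# Crash after only the first half of the page reached the disk.
on_disk = torn_write(old_page, new_page, blocks_written=1)

# The page is now neither the old nor the new version.
is_torn = on_disk != old_page and on_disk != new_page

# A checksum of the intended contents notices the mismatch but cannot repair it.
checksum_detects = zlib.crc32(on_disk) != new_checksum

# With a full-page image in the WAL, recovery simply overwrites the torn page.
wal_full_page_image = new_page
recovered = wal_full_page_image
```

A checksum alone only detects the damage. Repair needs either a full copy of the page (FPW, or a double-write file) or WAL records that can safely be replayed against either half.<br />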
<br />
Various strategies were discussed for fixing full page writes.<br />
<br />
Tom liked the idempotent writes approach. The NTT group plans to work on these problems and submit some patches.<br />
<br />
Increasing the num_buffer_partitions might scale us to more cores. We've been hung up on lack of performance testing for these changes though. Kevin cited a real-world case where increasing drove performance down. There is also CPU sharing on the LWLocks, which means that we have a cacheline for each buffer partition. We could improve this.<br />
<br />
Why are the LWLocks in one huge array? Why not partition them? One issue is looking up all the locks belonging to one partition. There was more discussion about possible LWLocks structures.<br />
<br />
We can't make this configurable at runtime because we'd lose a lot of compiler optimizations, although there may be a way around that. We can't ignore the one-core use case though.<br />
<br />
===Clustering Tables Concurrently===<br />
<br />
For 9.1 Simon proposed a command which allowed you to switch which file a table was associated with at runtime. The use for that was clustering a table concurrently. The syntax of this switch operation was problematic. <br />
<br />
This is important because it affects the amount of I/O we do on large tables. For example, master/child table setups (like invoice and invoice_items) are very common. It's really good to keep child records together.<br />
<br />
The way you do this, is you replicate the database via londiste, you make all your changes to cluster on the copy, and recopy it over. But you need to capture all the intermediate changes. Simon wants a Queueing operation built into the core which will capture intermediate changes.<br />
<br />
Simon wants to put the queueing mechanism into core, for cluster concurrently, for materialized views, for replication. This isn't a queueing mechanism, so much as logical replication buffer. This is kind of like putting pg_reorg into core, only according to Treat pg_reorg doesn't work this way.<br />
<br />
Haas suggested that what we really want is index-organized tables rather than this whole mechanism. Heikki suggested a background daemon which rewrites a bit of the table at a time. But that sounds like vacuum full.<br />
<br />
This is two patches, one is generalized queueing mechanism, the second is cluster concurrently. There was more discussion of alternatives. Koichi tested multiprocess clustering of tables, but it didn't work very well.<br />
<br />
===Database Federation Support===<br />
<br />
This is another proposal for contributing PostgresXC code. PostgresXC contains a bunch of code for doing queries on remote database nodes, including cross-node joins. We can submit patches, but they will be slow. For one thing, PostgresXC will be very busy with alternative development. Probably for 9.3.<br />
<br />
Code is appropriately licensed, just not documented very well. There may be some issues with old EDB code, but that is being worked out. <br />
<br />
Haas mentioned the need to break this up into a lot of small patches.<br />
<br />
===User-configurable Daemons===<br />
<br />
User-defined functions could have a loop and execute for many hours, or forever. Basically what we need is just startup and shutdown. <br />
<br />
What about pgAgent? This is supposed to be something different. And pgAgent is C++/GUI tool. <br />
<br />
Another idea is the stored procedures we're talking about, which aren't one big transaction. We probably need that first. Some people want to write a helper in C to help replication or whatever. This sounds like writing your own application server.<br />
<br />
There was some discussion of how this relates to parallel query.<br />
<br />
This is essentially a backend which doesn't have a frontend attached to it.<br />
<br />
If you didn't have to have a single long-running transaction, then you could ad-hoc this with init scripts. That's ugly, though. You could have two processes with a pipe, Heikki suggested.<br />
<br />
Stark said that this whole thing could be developed as an extension. Listen/notify is a big use case for this.<br />
<br />
None of this works without a non-transaction backend code segment. That has to come first, but it's easier than general stored procedures.<br />
<br />
===Moving Contrib Around===<br />
<br />
Smith would like to move several contrib extensions into core. He actually wants to do this for 9.1 for packaging. People don't like to run things in contrib because they don't trust them. Greg has moved six and they're so low impact that we could move them over even in beta.<br />
<br />
If you rearrange the source tree, it doesn't modify the packages at all. You need to modify the build scripts. The place where the extensions end up is /share/postgresql/extension. The PLs are there. Doing that is a matter of getting the packagers to change things.<br />
<br />
Some extensions are ready for production and safe, others are not. We need to break down contrib in a way sysadmins can understand. Extensions with external dependencies need to be separate packages. The only contribs which have external dependencies are UUID, and a couple others. So we can add the stuff without dependencies.<br />
<br />
We need to clearly document this for the packagers. There is a precedent for this with Perl and Python etc. Dave, Magnus, and David Wheeler were talking about the PGXN site, and the Perl modules are a PITA. Users hate having loads and loads of packages. We do need to be careful not to oversplit it.<br />
<br />
We need to flag extensions as maintained by core postgresql. But sysadmins are not going to install anything optional.<br />
<br />
First we need to agree on the categorization, then we need to communicate that to packagers. Moving the source code around is a last step.<br />
<br />
Greg Smith already made a list of seven: auto_explain pageinspect pg_buffercache pg_freespacemap pgrowlocks pg_stat_statements pgstattuple.<br />
<br />
Adminpack is a borderline case. Should we allow it or not? Some people don't like it because you can write files. Probably defer for now.<br />
<br />
We need regression tests, documentation improvements, and Greg's patch. Not useful to set up a make target for the packagers. And we need documentation for packagers.<br />
<br />
Make Install should maybe build those seven as well. <br />
<br />
There was consensus to go ahead with Greg's patch.<br />
<br />
===Managing Release Schedules===<br />
<br />
Dave would like to make the schedule more regular without having crappy releases. Stephen suggested having more information about patches in the CF app. Haas thinks we should just have a time-based schedule, absolutely. Getting new features is not a problem. <br />
<br />
The problem is that at the end of every cycle we end up slipping. Or we get into major bikeshedding on an issue right at the end of a schedule. We could have a release manager who would keep things on schedule.<br />
<br />
Haas says the real problem is huge stuff getting submitted at the end of the cycle. We had 6 major features contributed for the last CF. The problem is people starting from scratch at the beginning of every cycle. People arrange their own schedules based on the time they have available. <br />
<br />
Fetter pointed out that no other DB releases a major release every year, so we're not doing that badly. Haas suggested that he could stop watching the schedule. He doesn't like being the schedule jerk.<br />
<br />
Stark asked why we're landing large patches at the end instead of a bit at a time. Linux has a tree called Linux-next where they accept a lot of interim patches. Or you could bump the patches to the next release.<br />
<br />
Tom pointed out that 9.1 is coming out more-or-less on time. But that's partly because Haas and Tom were working on other people's code for 3 months. Collations also should not have gone in when it did. <br />
<br />
Selena suggested that we actually need a process for dealing with bad patches and reversing stuff. The problem with Collations was not that we wanted to reject it right away, we didn't know until later it was an issue and it would take a lot of effort to back out.<br />
<br />
We need to distinguish patches which need more review from patches which the author hasn't fixed. Haas is unhappy with a system where some people's patches are treated differently.<br />
<br />
Simon pointed out that we also argue about patches until the deadline, in addition to people working until the end. Fetter proposed again that we have a release manager. Josh suggested that the schedule is not the problem; the problem is the use of Haas's and Tom's time. Haas pointed out that people argue against rejecting patches.<br />
<br />
There was a lot of further discussion regarding the patch argument process, which would continue over beer.<br />
<br />
===Other Business===<br />
<br />
Marc Fournier has retired from the Core Team. Marc is the fourth person who has left. It's really hard when people leave -- it's usually because their lives change. Marc was unbelievably important in providing the community with infrastructure at the beginning of the project, without which PostgreSQL could not have been successful. But he couldn't even make it to pgCon this year. We have a good infrastructure team, so the handoff is fine. We should thank him for the time when he was there when we had nothing.<br />
<br />
Marc will still be contributing to our infrastructure and we will be using hub.org for some things for a while. Mainly he's not going to be around day-to-day, and won't be building the tarballs.<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2012_Developer_Meeting&diff=37565PgCon 2012 Developer Meeting2023-02-10T08:43:10Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers is being planned for Wednesday 16th May, 2012 near the University of Ottawa, prior to pgCon 2012. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org). <br />
<br />
Please note that this year the attendee numbers have been cut to try to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.2 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies, unlike in previous years.<br />
<br />
This is a PostgreSQL Community event. Room and refreshments/food sponsored by EnterpriseDB. Other companies sponsored attendance for their developers.<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be from 8:30AM to 5PM, and will be in the "Red Experience" room at:<br />
<br />
Novotel Ottawa<br />
33 Nicholas Street<br />
Ottawa<br />
Ontario<br />
K1N 9M7<br />
<br />
Food and drink will be provided throughout the day, including breakfast from 8AM.<br />
<br />
[http://maps.google.ca/maps?f=q&source=s_q&hl=en&geocode=&q=novotel+ottawa&aq=&sll=49.891235,-97.15369&sspn=36.237851,79.013672&ie=UTF8&hq=novotel+ottawa&hnear=&ll=45.421528,-75.683699&spn=0.036869,0.077162&z=14&iwloc=A&layer=c&cbll=45.425741,-75.689638&panoid=Z4FUGnkZkdHAOkIxyjjS9Q&cbp=12,25.83,,0,-0.6 View on Google Maps]<br />
<br />
== Attendees ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname):<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus (Secretary)<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Dimitri Fontaine<br />
* Stephen Frost<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Shigeru Hanada<br />
* Hitoshi Harada<br />
* KaiGai Kohei<br />
* Tom Lane<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Dave Page (Chair)<br />
* Simon Riggs<br />
* Teodor Sigaev<br />
* Greg Smith<br />
<br />
== Proposed Agenda Items ==<br />
<br />
Please list proposed agenda items here:<br />
<br />
* Agree CommitFest schedule for 9.3 (Strawman from Simon)<br />
** CF1 June 15, 2012 - 1 month<br />
** CF2 Sep 15, 2012 - 1 month<br />
** CF3 Nov 15, 2012 - 1 month<br />
** CF4 Jan 15, 2013 - 2 months<br />
* Queuing [Dimitri, Kevin]<br />
** Description: efficient and transactional queuing is a very common need for applications using databases, and could help implement some internal features<br />
** Goals: get an agreement that core is the right place to solve that problem, and what parts of it we want in core exactly<br />
* Materialized views [Kevin]<br />
** Description: Declarative materialized views are a frequently requested feature, but mean many things to many people. It's not likely that an initial implementation will address everything. We need a base set of functionality on which to build.<br />
** Goals: Reach consensus on what a minimum feature set for commit would be.<br />
* Partitioning and Segment Exclusion [Dimitri]<br />
** Description: to solve partitioning, we need to agree on a global approach<br />
** Goals: agreeing on SE as a basis for better partitioning, having a "GO" on working on SE<br />
* MERGE: Challenges and priorities [Peter G]<br />
** Description: Implementing the MERGE statement for 9.3. It is envisaged specifically as an atomic "upsert" operation.<br />
** Goals: To get buy-in on various aspects of the feature's development, and, ideally, to secure reviewer resources or other support. Because of the complexity of the feature, early interest from reviewers is preferable.<br />
* Row-level Access Control and SELinux [KaiGai]<br />
** Security label on user tables<br />
** Dynamic expandable enum data types<br />
** Enforcement of triggers by extension<br />
* Enhancement of FDW at v9.3 [KaiGai]<br />
** Writable foreign tables<br />
** Things to be pushed down (Join, Aggregate, Sort, ...)<br />
** Inheritance of foreign/regular tables<br />
** Constraint (PK/FK) & Trigger support.<br />
* Type registry [Andrew]<br />
** Provide for known OIDs for non-builtin types, and possibly for their IO functions too<br />
** Would make it possible to write code in core or in extension X that handles a type defined in extension Y.<br />
* Ending CommitFests in a timely fashion, especially the last one. Avoiding a crush of massive feature patches at the end of the cycle. Handling big patches that aren't quite ready yet. Getting more people to help with patch review. [Robert]<br />
* What Developers Want [Josh]<br />
** Description: a top-5 list of features and obstacles to developer adoption of PostgreSQL (with slides)<br />
** Goal: to set priorities for some features aimed at application users<br />
* In-Place Upgrades & Checksums [Greg Smith, Simon]<br />
** Description: Revisit in-place upgrades of the page format, now that pg_upgrade is available and multiple checksum implementations needing it have been proposed.<br />
** Goal: Nail down some incremental milestones for 9.3 development to aim at.<br />
* Autonomous Transactions [Simon]<br />
** Overview of idea, relationship to stored procedures<br />
** Feedback, buy-in and/or alternatives<br />
* Parallel Query [Bruce Momjian]<br />
** Hope to get buy-in for what parallel operations we are hoping to add in upcoming releases<br />
* Report from Clustering Meeting [Josh] (10 min)<br />
** Description: to summarize the discussions of the cluster-hackers meeting from the previous day<br />
** Goal: inter-team synchronization. Possibly, decisions requested on specific in-core features.<br />
* Double Write Buffers [Simon]<br />
** Is anyone committing to do that for 9.3?<br />
<br />
* Goals, priorities, and resources for 9.3 [All]<br />
** For roadmap and planning purposes, set expectations and coordinate work schedules for 9.3. Confirm who is doing what, identify interested reviewers at start, and check for gaps.<br />
<br />
== Agenda ==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|08:00<br />
|Breakfast<br />
|<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|08:30 - 08:45<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|-<br />
|08:45 - 09:15<br />
|Autonomous transactions<br />
|Simon Riggs<br />
<br />
|-<br />
|09:15 - 09:40<br />
|[[Queuing]]<br />
|Dimitri Fontaine/Kevin Grittner<br />
<br />
|-<br />
|09:40 - 09:50<br />
|Report from the Clustering Meeting<br />
|Josh Berkus<br />
<br />
|-<br />
|09:50 - 10:10<br />
|Type registry<br />
|Andrew Dunstan<br />
<br />
|-<br />
|10:10 - 10:30<br />
|Access control and SELinux<br />
|KaiGai Kohei<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 10:45<br />
|Coffee break<br />
|<br />
<br />
|-<br />
|10:45 - 11:15<br />
|Enhancement of FDWs in 9.3<br />
|KaiGai Kohei<br />
<br />
|-<br />
|11:15 - 11:30<br />
|What developers want<br />
|Josh Berkus<br />
<br />
|-<br />
|11:30 - 12:00<br />
|Parallel Query<br />
|Bruce Momjian<br />
<br />
|-<br />
|12:00 - 12:30<br />
|MERGE: Challenges and priorities<br />
|Peter Geoghegan<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch <br />
|<br />
<br />
|-<br />
|13:30 - 14:00<br />
|Materialised views<br />
|Kevin Grittner<br />
<br />
|-<br />
|14:00 - 14:20<br />
|In place upgrades and checksums<br />
|Simon Riggs/Greg Smith<br />
<br />
|-<br />
|14:20 - 14:45<br />
|Partitioning and segment exclusion<br />
|Dimitri Fontaine<br />
<br />
|-<br />
|14:45 - 15:00<br />
|Commitfest Schedule<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:15<br />
|Tea break<br />
|<br />
<br />
|-<br />
|15:15 - 15:40<br />
|Commitfest management<br />
|Robert Haas<br />
<br />
|-<br />
|15:40 - 16:45<br />
|Goals, priorities, and resources for 9.3<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|16:45 - 17:00<br />
|Any other business/group photo<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
| <br />
|}<br />
<br />
==Minutes==<br />
<br />
== 2012 Developer Meeting Minutes ==<br />
<br />
Started with introductions.<br />
<br />
=== Autonomous Transactions ===<br />
<br />
Simon brought this to get some feedback on the idea. Autonomous transactions (ATX) are a transaction inside a transaction ... a new top-level transaction. In Oracle, it's not just one new transaction, it's a whole new context which can submit multiple new transactions. There is no connection between parent and child transactions, which can result in new types of deadlocks.<br />
<br />
Each new transaction context would allocate a new pg_exec from a pg_proc call. Implementation is straightforward; we just have to handle locking. This allows us to implement stored procedures in an interesting way. If we treat a stored procedure as an autonomous transaction, then this solves some problems. We can put COMMIT, ROLLBACK, and other things in stored procedures.<br />
<br />
Tom suggested that ATX don't need to conflict with parent transaction locks. Noah pointed out some issues with that. We'd need to have a switch for Stored Procedures in order to indicate they are autonomous, like using CREATE STORED PROCEDURE. We'd be using an additional client slot for each ATX, which could be a problem. Oracle's limit on ATX is 70 per connection, which seems like a lot. Maybe we should try to hold them all to a single session like it was a subtransaction. Not sure if we can do this; Simon will need to take a look at it.<br />
<br />
ATX also need to eventually be able to run utility commands, like VACUUM and CREATE INDEX CONCURRENTLY. <br />
<br />
=== Queueing ===<br />
<br />
Ultimately the materialized views will need some kind of queueing. Once we have queueing in core, it could be generally useful. CLUSTER CONCURRENTLY would need it, or application queues will need queueing structure. We might want to have it exposed at the SQL level. You put things in the queue, and at commit, others can see it. LISTEN/NOTIFY is sort of a queue, but is only one item and vanishes if you're not listening.<br />
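<br />
As a toy illustration of the visibility rule described above (items put in the queue inside a transaction become visible to others only at commit), the following Python sketch models such a queue in memory. The class and method names (<code>TransactionalQueue</code>, <code>begin</code>, <code>enqueue</code>, <code>commit</code>, <code>rollback</code>, <code>dequeue</code>) are hypothetical and are not a proposed PostgreSQL API; the sketch does not model locking, logging, or crash safety.<br />
<br />
```python
from collections import deque


class TransactionalQueue:
    """Toy in-memory queue: enqueued items are published only at commit."""

    def __init__(self):
        self._visible = deque()   # committed items that any session can dequeue
        self._pending = {}        # txid -> items enqueued but not yet committed
        self._next_txid = 1

    def begin(self):
        """Start a (hypothetical) transaction and return its id."""
        txid = self._next_txid
        self._next_txid += 1
        self._pending[txid] = []
        return txid

    def enqueue(self, txid, item):
        """Buffer an item; other sessions cannot see it yet."""
        self._pending[txid].append(item)

    def commit(self, txid):
        """Publish the transaction's items in the order they were enqueued."""
        self._visible.extend(self._pending.pop(txid))

    def rollback(self, txid):
        """Discard the transaction's items without publishing them."""
        self._pending.pop(txid)

    def dequeue(self):
        """Return the oldest committed item, or None if nothing is visible."""
        return self._visible.popleft() if self._visible else None
```
<br />
Unlike the LISTEN/NOTIFY behaviour mentioned above, a committed item in this model stays in the queue until a consumer dequeues it, whether or not anyone was listening at commit time.<br />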
<br />
Like a table, but access semantics are different. Would need logged/unlogged queues. Some discussion about how queues are different from tables. Haas wondered whether what we need for internal queues is the same as what users need for user-visible queues.<br />
<br />
Queue-tables also need different performance characteristics. We don't need queues so much as we need deferred action. We also need background processes which wake up and check the queue. Queues could be built on top of tables. Discussion about uses, designs for queues ensued.<br />
<br />
We need a really clear design spec for how queues would work. There are specific performance improvements we want for queueing, but they're likely to be just improvements on table performance. The idea is to have a generalized API instead of reinventing it a bunch of times.<br />
<br />
The next step is to collect use cases. [[Queueing|Kevin & Dimitri will collect use cases on a wiki page]], to design an API. Performance optimization needs to look at access patterns. Simon pointed out that this works similarly to fact tables where you want to move stuff forward constantly. Users might not use queues as pure FIFO.<br />
<br />
Unlinking segments works for deleting from the beginning of a table but indexes could be a problem. Block numbers could be a practical problem, we might need wraparound, or reset-to-zero.<br />
<br />
=== Report on Clustering Meeting ===<br />
<br />
See [[PgCon2012CanadaClusterSummit|minutes]].<br />
<br />
=== Type Registry ===<br />
<br />
WIP idea. Hstores aren't built in, so they get an arbitrary OID, which causes issues with writing generic code. Looking up the type name is expensive. It would be nice to have a registry for types where people writing extensions are allocated an OID. Andrew gave an example of hacking Postgres to support upgrading from the optional JSON type in 9.1 to the built-in type in 9.2.<br />
<br />
We need to expose the pg_upgrade stuff as well, set_binary_upgrade. Should we use something other than an OID? We need the OIDs for upgrade and for drivers. Driver identicalness isn't the same as pg_upgradability, so we might want two different switches for that. Maybe we should have a new OID if you change the storage of a type?<br />
<br />
What's the criterion for allocating an OID? We'll need some kind of judgement. We'll also need to block off the OID reserved space into sections. People generally found this to be a good idea. Andrew will create a wiki page and follow-up. We could just do this for contrib, but that's not really a good idea.<br />
<br />
We could have CREATE TYPE ... WITH OID = ###, for base types only. The folks who want it for ENUM etc. are just replication/clustering authors. There was discussion of other approaches to handling these problems. Users will create types with OIDs which conflict.<br />
<br />
=== Access Control and SE-Linux ===<br />
<br />
Several components: to add security around user tables. Second, to add additional conditions around user queries. Third, a condition around new tuples which are inserted. Fourth, we should have ENUMs to represent user-defined security labels. Did some performance testing on the last part, having labels as OIDs was much faster and closer to non-SE performance.<br />
<br />
There's concurrency issues around seeing new labels -- we'd have a huge issue with inserting the labels into the system table. Creating a new label could be a downtime event; we can have a utility command, and we can require users to create a new label first manually. But what happens if the new label isn't there? Should error just like a constraint.<br />
<br />
Is there a way to query SE-linux to get all of the security labels? That's hard, because it's four fields. The last field is an issue for prediction. There's a lot of value in having row-level security be completely type-agnostic; we just have a string and we don't care what's in it.<br />
<br />
An SE Label consists of: user, row field, type field, and (something inaudible). That last part is a kind of bitmap. Do we actually need that part, though? What's multi-category security, will we support that? How many different labels would you have on a specific table?<br />
<br />
The idea of row-level security is to force quals on people. Currently it's not transparent. The discussion on labels needs to continue elsewhere.<br />
<br />
Also we need to address FK and PK implementation for security labels.<br />
<br />
=== What Developers want ===<br />
<br />
PostgreSQL is becoming the default for many web application frameworks like Ruby on Rails and Django. But there are plenty of user complaints. They don't show up on the PostgreSQL mailing lists. The developer complaints are on stackoverflow, forums for virtual hosting companies, and application specific lists like ORM/framework layers.<br />
<br />
Two categories of developer comments: blockers that cause developers to use another tool, and enhancers that would expand the market into new areas. Many of these are available features, but they seem too hard to use.<br />
<br />
==== Blockers ====<br />
<br />
1. Installation onto developer laptops (Windows / OS X)<br />
* Re-installs problematic in Windows<br />
* Reinstall of Redis is the competitor here; it is closer to a true one-click installer.<br />
* People use Redis because it's "easy to install", while PostgreSQL ran into one of multiple problems (reported on lists like pgsql-general)<br />
* postgres.app is aiming at simplifying things for Mac developers, is in beta<br />
* Kevin: also seen issues with Rails + Rake, lots of questions on Stackoverflow.<br />
2. Complexity of configuring PostgreSQL, i.e. postgresql.conf<br />
* Shared memory issues on the Mac<br />
** Could use POSIX shared memory instead of System V<br />
* Need a configuration generator and hints for settings that are set incorrectly<br />
** Example: need to increase size of the transaction log with pg_xlog having X GB of space. Math to determine settings like checkpoint_segments given a GB target is complicated.<br />
3. Better analysis and troubleshooting<br />
* Expose everything via SQL, e.g. autovacuum; no log parsing.<br />
* EXPLAIN needs to be easier to understand, suggest what needs to be done when planner mistakes are made.<br />
* Freeze a stable query plan needed for some apps.<br />
4. Easier to understand replication<br />
* External projects that try to help are often less maintained/robust/documented than core<br />
* Same thing is true for pooling projects<br />
5. Better pg_upgrade<br />
* More trustworthy<br />
* Handle version upgrades across large clusters<br />
* Deliver on <5 minutes promise. Can take a long time for statistics ANALYZE. Needs to save/restore that instead.<br />
6. MERGE UPSERT<br />
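<br />
To make the checkpoint_segments arithmetic from blocker 2 concrete, here is a small Python sketch. It assumes the rule of thumb given in the 9.x-era documentation, that the number of WAL segment files is normally no more than (2 + checkpoint_completion_target) * checkpoint_segments + 1, with 16 MB per segment; it ignores wal_keep_segments and WAL archiving backlogs, so treat the result as a starting estimate only. The function names are made up for this example. (In later releases checkpoint_segments was replaced by max_wal_size, which takes a size directly.)<br />
<br />
```python
import math

WAL_SEGMENT_MB = 16  # default WAL segment size


def estimated_wal_files(checkpoint_segments, completion_target=0.5):
    """Rule-of-thumb upper bound on WAL segment files in normal operation."""
    return (2 + completion_target) * checkpoint_segments + 1


def max_checkpoint_segments(target_gb, completion_target=0.5):
    """Largest checkpoint_segments whose estimated WAL footprint fits in target_gb."""
    budget_files = target_gb * 1024 / WAL_SEGMENT_MB
    segments = math.floor((budget_files - 1) / (2 + completion_target))
    if segments < 1:
        raise ValueError("target too small for even one checkpoint segment")
    return segments


if __name__ == "__main__":
    # 10 GB of pg_xlog space is 640 segment files of 16 MB each.
    print(max_checkpoint_segments(10))  # → 255
```
<br />
A configuration generator of the kind asked for above could run this sort of calculation on the user's behalf instead of expecting them to derive it by hand.<br />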
<br />
==== Enabling features to broaden userbase ====<br />
<br />
1. Finish JSON support<br />
* Most popular new feature on news sites LWN etc. since 9.0 replication<br />
* Some people want simple document storage like NoSQL, but with PostgreSQL reliability<br />
* Needs indexing performance improvements<br />
* More extract from JSON features<br />
* Schemaless PostgreSQL is possible with JSON or hstore, but it's not obvious that's true.<br />
2. Better extensions<br />
* Packaging for popular extensions on popular <br />
* Extensions should follow replication; move .so to standby? Lots of resistance to that idea.<br />
* Better visibility of extensions, and extension aggregators like PGXN.<br />
3. Client language queries<br />
* Straight from, say, Python to a parse tree<br />
* SQL Server/.Net does move in this direction for C#<br />
* Competition here is the non-relational databases<br />
4. Built-in sharding<br />
* PL/Proxy: must find it, minimal docs, questions around support situation<br />
* Target user base here doesn't like SQL or functions much either<br />
* Base on writable FDW?<br />
* Borrow ideas from notable sharded PostgreSQL deployments?<br />
<br />
=== Enhancements of FDW in 9.3 ===<br />
<br />
What do we need for FDW in 9.3? Want discussion of what to implement. Hanada is working on pgsql_fdw. Wants this in the core distribution, to replace dblink. Currently FDWs are read-only so users still need dblink. There is a list of features Hanada wants to implement. <br />
<br />
One issue is naming. Currently we already have postgresql_fdw in core, which is used by dblink. Proposed pgsql_fdw, but that doesn't fit our naming conventions. We should maybe rename the dblink one to dblink_fdw. There is also an issue around options where it should consult libpq on what options are supported. Since the function name conflicts are internal, this would only mess with pg_upgrade.<br />
<br />
Features include:<br />
* writeable FDWs<br />
* aggregation pushdown<br />
* table sorting pushdown<br />
* table inheritance with FDW<br />
* constraint support on foreign tables<br />
<br />
Writeable FDW is the most interesting feature. One issue is transaction control; the suggestion is that it's the responsibility of the FDW module to control transactions, not PostgreSQL. There are two ways to do it: one is that every write to a FT is an autocommit transaction. The other option is that the FT commits when you commit your local transaction. SQL Server automatically does two-phase commit. But it might be better for a first version not to have any transaction control.<br />
<br />
We will implement with no remote transaction control for the version for 9.3. Plus distributed transactions have lots of interesting failure conditions.<br />
<br />
KaiGai plans to get pgsql_fdw into the first CF so that we can play with it.<br />
<br />
=== Parallel Query ===<br />
<br />
Everyone ran screaming from the room. First, understand that not everyone is I/O bound. There are cases where the system is primarily memory or CPU-constrained, such as a handful of very complex queries which are primarily memory-bound. Since we're not always I/O constrained, we need to look at ways to parallelize memory/CPU-constrained systems. We need to start looking incrementally at how we can do some things in parallel.<br />
<br />
Already-completed parallel pg_dump is an example of this. We need more cases where we can surgically parallelize stuff. Josh brought up the issue of PostGIS queries which need CPU parallelism. Greg brought up 48-core server with 256GB of RAM for a 100GB server. If we can get 4 CPUs, we get better memory bandwidth. We're sometimes memory-bound because of non-sharable memory bandwidth. Bruce told story of Informix 6's parallelism disaster.<br />
<br />
We need a task list of individual tasks we could parallelize instead of parallelizing everything. We do need a general "helper process" infrastructure so that we can hand work off to them. Simon is working on the parallel worker tasks now. <br />
<br />
Bruce and Greg discussed Greenplum's history. The way we generate query plans makes this hard, since it's kind of a "pull" basis: "gimme a tuple". If our query plan was a task list it would be easier. MPP systems have plans where they look at which steps can be parallelized and what they cost.<br />
<br />
The hard stuff is in the optimizer. Creating a cost model is really difficult. Peter brought up the Intel threading building blocks as a generalized parallelism case with a graph dependency. It has this thing called "task stealing". The classic parallelism case is video rendering, but our tasks are not like that. We need one-off cases for each task. <br />
<br />
It's like the Windows port in terms of scope and complexity. This is different from the Windows port, in that we can do it piecemeal, but we need to decide to go down the road of additional complexity. Dimitri suggested exposing the executor as a virtual machine. A lot of stuff is different. Josh suggested starting with parallel index build as the easiest single task with solid benefit. Bruce points out the even simpler case is to build several indexes in parallel over the same scan.<br />
<br />
Additional items that can be parallelized:<br />
<br />
* Redo<br />
* Vacuum<br />
* Logical dump<br />
* Sorting<br />
* Scans, particularly partition table scans<br />
<br />
=== MERGE ===<br />
Peter hasn't done as much with this as he expected so far, but plans to get something done for 9.3. What's the best way to solve this problem? Josh spoke about the need for atomic UPSERT, Peter agrees that that's a good version 1 goal.<br />
<br />
There's a fair amount of speculation on how to implement this feature. A lot of people want to use predicate locking, but we need an accessible API and some more features for predicate locking to make it work. We could also have a new kind of lock associated with an index tuple. The UPSERT case requires solving the hard problem, general MERGE beyond that is detail work. One thing we need to do is finish deprecating user-definable RULEs. <br />
<br />
Greg worked with a GSOC project for MERGE, but concurrency completely didn't work. We still have to solve the concurrency issues. Robert remembers that there were intrinsically complex issues without even a possible perfect solution. We need to look at the thread where we looked at the problems; the definition of sensible behavior is in question (thread: http://archives.postgresql.org/message-id/AANLkTineR-rDFWENeddLg=GrkT+epMHk2j9X0YqpiTY8@mail.gmail.com). We need to define the spec first. We can look at what other databases do.<br />
<br />
We can allow weird things to happen — corner cases — with MERGE or UPSERT. We can tell people to use SSI to avoid those weird issues. The SQL standard's MERGE doesn't really give us UPSERT, we should use different syntax. We want INSERT... ON DUPLICATE KEY UPDATE, not REPLACE INTO. We should ask MySQL folks about the history of this.<br />
<br />
Job #1 is building the simple case, UPSERT. We can do SQL-standard MERGE later. Greg wants reviewers to commit for this. This is really a Heikki thing. The Executor part needs expert review (Tom?).<br />
<br />
=== Materialized Views ===<br />
What's the minimum committable patch, and what direction should we take it in? Kevin has time to work on it, but it's been hard to schedule that time. <br />
<br />
* syntax for create/alter<br />
* new relkind in pg_class<br />
* pg_dump and restore support<br />
* being able to index them<br />
* statement to regenerate contents of matview (concurrently?)<br />
<br />
Will have an option to create a matview without filling it with data. pg_dump would use this. Would deal with the various ways of updating matviews, like incremental, later. If you wanted incremental updates on a matview which is too complex it would error. Further down, doing incremental updates via queueing mechanism. <br />
<br />
Also, there's the optimizer — substituting matviews for base tables automatically. That would be much later. Josh mentioned that someone had already written code for that. KaiGai asked about SE-Postgres and matviews, and discussed it with Kevin. Josh also asked about eventually doing on-request refresh.<br />
<br />
Simon wants us to call it something different from Materialized Views, because we won't have the optimizer stuff which Oracle does. Kevin is calling it declarative materialized views. And it's not clear that we want to handle query rewrite the same way Oracle does. We can have synchronous update of matviews, but more useful is queueing updates of the views so that they are "eventually consistent". Kevin talked about cranky judges.<br />
<br />
Phase I is just to do the object type and manual refresh. Incremental update will be later. There are a couple of other things you can do if you can guarantee that the matview data would produce the same result. There was discussion around what to call the feature given that we'll be implementing matviews in several releases.<br />
<br />
Dimitri suggested that we could use matviews as a working concept for correlation stats. Simon discussed issues of setting acceptable staleness at data request time, both for matviews and for replication.<br />
<br />
=== In place upgrades & Checksums ===<br />
<br />
Where had the page format discussion gone wrong in the past? There's 4 issues:<br />
<br />
* adding more bytes in the header<br />
* having multiple page views<br />
* time required to upgrade<br />
<br />
The whole discussion talked about 32-bit checksums. But with 16-bit checksums, we could borrow pg_tli, and add a checksummed bit. Greg said we bump the page format, Robert said no. Greg wants us to "get practice" in having new page formats. We need to flag whether or not the page is checksummed. Will we ever need 32-bit checksums? If we implement 16-bit, we'll find out. <br />
<br />
Simon analyzed the error rate with 16-bit checksums, and felt that it was enough for an 8K page, but not a 32K page. It was not clear why the size of the page makes a difference. Plus we're not expecting an error on just one page.<br />
<br />
What are we planning to include in the checksum? What are we going to checksum? Jeff has been looking at issues where whole disk blocks are getting swapped. Suggested including the relfilenode etc. in the checksum in order to make sure that the page is where it's supposed to be. Would it prevent us from moving data around? Changing tablespaces, etc. might be an issue. Is table OID better or worse than relfilenode? Discussion of what pg_upgrade does. OID seems better.<br />
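<br />
The idea of covering the page's intended location in the checksum can be sketched as follows. This is only an illustration in Python: <code>zlib.crc32</code> is a stand-in hash and not the algorithm discussed at the meeting, the function names are hypothetical, and the relation identifier is just an integer that could be either a table OID or a relfilenode. For an idealized 16-bit checksum, a random corruption would slip through with probability of roughly 1 in 65,536.<br />
<br />
```python
import zlib

PAGE_SIZE = 8192  # 8K page, as discussed above


def page_checksum(page: bytes, relation_id: int, block_no: int) -> int:
    """Illustrative 16-bit checksum that also covers the page's intended location."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    # Seed the CRC with the relation identifier so the same bytes stored under
    # a different relation will usually produce a different checksum.
    digest = zlib.crc32(page, relation_id & 0xFFFFFFFF)
    # Mix in the block number so a page that lands on the wrong block of the
    # same relation fails verification even though its bytes are intact.
    digest ^= block_no
    return digest & 0xFFFF  # keep only 16 bits


def verify_page(page: bytes, relation_id: int, block_no: int, stored: int) -> bool:
    """True if the stored checksum matches this page at this location."""
    return page_checksum(page, relation_id, block_no) == stored
```
<br />
Because the block number is simply XORed in before truncation, two blocks whose numbers differ only above the low 16 bits would collide in this sketch; a real design would need a better mixing step. Tying the checksum to a relation identifier is also what raises the question above about moving data around.<br />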
<br />
Need to have some way to track what's checksummed and what's not in a table. Each page will have a checksum bit. Add command VACUUM CHECKSUM ON. And we don't really have to implement an "old page reader". <br />
<br />
Hint bits are the biggest implementation issue. Simon's approach was to full-page-write all pages with hint bits once per checkpoint cycle, but there's still some stuff to be worked out there. There's an issue with hint bits being set while the page is being written by another process. Discussed the performance impact of this. <br />
<br />
For the first version, we need to look at whether it's reliable. That is more important than the performance. Bulk loading has a major performance issue. Setting hint bits on the first select of a major table generates a whole bunch of WAL traffic.<br />
<br />
=== Partitioning and Segment Exclusion ===<br />
Current partitioning is "just good enough" to deter building something better. Dimitri has been thinking about what to do instead. Three problems:<br />
<br />
# when do you create the new partitions<br />
# constraint exclusion has all kinds of issues<br />
# index and constraints — no primary keys etc.<br />
<br />
We've had several proposals. Declarative partitioning syntax. But as long as we have separate tables, we only solve problem 2. We've had 5 years of partial patches for that problem.<br />
<br />
So how about another idea: the problem is having a table with a huge data set, and addressing only part of that table. We already have table segments -- we could have segments which are determined by ordering. The idea is to have an index which, given the partitioning key, would tell us where the tuples are located — in which segment. <br />
<br />
At what level in the system should a partition exist? Simon pushed for above-table level. Now we're looking at below table level. So the system defines partitions, not the user. We can look at a large table of 100 segments as having 100 partitions. If we store metadata about each partition, we can look at that to decide which segments to scan. Josh pointed out that this doesn't solve all or even most of the issues which partitioning is intended to solve. This solution is really a heavily compressed index or a performance optimization for scanning large, time-based tables. It's a sort of lossy index.<br />
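<br />
A minimal sketch of segment exclusion from per-segment metadata, assuming each segment records the minimum and maximum value of the partitioning key it contains. The class and function names are hypothetical and the keys are plain integers for simplicity.<br />

```python
from dataclasses import dataclass


@dataclass
class SegmentSummary:
    """Hypothetical per-segment metadata: key range covered by one table segment."""
    segment_no: int
    min_key: int
    max_key: int


def segments_to_scan(summaries, lo, hi):
    """Return the segments whose key range overlaps the query range [lo, hi]."""
    return [s.segment_no for s in summaries
            if s.max_key >= lo and s.min_key <= hi]


# Three segments of a time-ordered table, keyed by date encoded as YYYYMMDD.
summaries = [
    SegmentSummary(0, 20120101, 20120131),
    SegmentSummary(1, 20120201, 20120229),
    SegmentSummary(2, 20120301, 20120331),
]
print(segments_to_scan(summaries, 20120215, 20120305))  # [1, 2]
```

The summary is lossy in the sense described above: it can only rule segments out, so every segment it returns still has to be scanned and filtered.<br />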
<br />
Don't get hung up on 1GB segments, we might change that in the future. Or we could change that for this. Jeff Davis pictured something different for constraint exclusion with something simpler. Discussion about index scans, which may not be as efficient as it could be. Index-Only scan needs some optimizations.<br />
<br />
=== CommitFest Schedule ===<br />
<br />
Simon proposed a schedule, which includes the last commitfest being 2 months. Robert would like it to be shorter, not longer. Robert pointed out that the final CFs have been getting longer, not shorter since 9.0. Two issues related to commitfests:<br />
<br />
* works better when lots of people volunteer to review<br />
* last commitfest doesn't end.<br />
<br />
We would all benefit if we ended the CF earlier. Robert thought we should make CF4 shorter, not longer. Josh suggested that we could release every 6 months. A big problem is people still writing patches during CF4. We wait until everyone is exhausted and then decide what to bounce. We should make decisions at the beginning of the commitfest. <br />
<br />
Suggested separating review and commitfest. We should triage at the beginning of the commitfest. Robert brought up Dimitri's patches as an example. Robert wants completion over priority, Simon says the opposite. The problem with a consensus process is that there's no consensus. We could have a release manager. It's the big patches which are the real problem, since people really want them and there's lots of stuff in them. <br />
<br />
The problem with prioritization is that we're promoting a big feature that's not quite there over several other patches which are ready. It's not fair to our contributors. But we could triage at the beginning because we're arbitrarily bumping stuff anyway. It's better to do it early than late. You can identify which patches are big or small, and which ones have a certain degree of readiness. Even if you're not correct, it'll help people allocate their time.<br />
<br />
For voting on priorities, we could vote and rate which ones are going to be easy or hard and how important they are for us. Dimitri outlined a system of point allocation and voting. Or we could list the committer on a patch at the beginning of the commitfest. That makes sense for the big patches, but not the small ones. So we should identify them at the beginning of the commitfest.<br />
<br />
Everyone is going to argue for their own stuff, though. People have different priorities. We also can't tell committers what to do, we can only ask. We'd like to get committer signoff early in the process. We might also want to sign off reviewers.<br />
<br />
Triage also needs to flag patches where we don't agree on the spec. <br />
<br />
We need to get better about giving feedback on the design for the patch. The problem with posting a design spec is that there's no formal review process for design spec. After CF3, a week of triage. If we haven't seen the big patch by the triage, it doesn't get into CF4 for big patches. <br />
<br />
Simon pointed out that it's hard to make rules for big patches because each one is different. <br />
<br />
So, changes to the process:<br />
* Planning week after the 3rd commitfest<br />
* "design spec" flagged submissions to the CF<br />
* write docs about the CF process<br />
* one patch, one review requirement<br />
<br />
=== CommitFest Management ===<br />
CF1: June 15 – July 15<br />
<br />
CF2: Sept 15 – Oct 15<br />
<br />
CF3: Nov 15 – Dec 15<br />
Planning Week – Dec 8-15<br />
<br />
CF4.1: Jan 15 – Feb 15<br />
Final Triage: Feb 1–7<br />
<br />
=== Goals, Priorities, and Resources for 9.3 ===<br />
<br />
Dave: Installers<br />
<br />
Andrew: Aggregation for JSON, Projecting data from JSON, Pretty-printing SQL, PL/perl binary format, binary output for psql, windows builds for extensions.<br />
<br />
Peter: UPSERT, trying to replace Flex, pg_stat_statements for query plans.<br />
<br />
Simon: Bi-Directional Replication<br />
<br />
Hanada: pgsql_fdw, other FDWs.<br />
<br />
Hitoshi: plv8, JSON support, some windowing function improvements.<br />
<br />
Kevin: Declarative materialized views, SSI performance.<br />
<br />
Jeff: statistics for ranges, range keys, range FKs, and range joins.<br />
<br />
Robert: performance, performance, performance. Reducing latency events. Write performance improvements. Can we optimize vacuum some more, reviewing patches.<br />
<br />
Josh: documentation, advocacy, maybe autoconfiguration. Release notes. <br />
<br />
Magnus: configuration directories. Monitoring. Simplifying replication.<br />
<br />
Dimitri: now working on "event triggers". Next step for extensions. Segment exclusion. Queueing in core design spec.<br />
<br />
Tom: backfilling weak spots in the planner.<br />
<br />
Alvaro: finalize FK locks. Allowing ALTER TABLE to reorder columns. <br />
<br />
Bruce: design spec for some parallel operations.<br />
<br />
Oleg & Teodor: improve SP-GiST. Indexing similarity. Also want to work on spatial join. JSON indexing if they can get sponsorship.<br />
<br />
Noah: global temp tables, local XID space for temp tables, more ALTER TABLE improvements.<br />
<br />
Greg: reviving dead projects: config directory, eliminate recovery.conf, adding instrumentation for timing events inside the database.<br />
<br />
KaiGai: SE row-level access control. <br />
<br />
Stephen Frost: list optimization work. SSL under Windows, supporting engines. <br />
<br />
=== Other Business ===<br />
<br />
Josh will write as-we-go release notes for alphas or whatever.<br />
<br />
We could have a mini-developer meeting in Prague. There was discussion about whether we should move the developer meeting around every year. This is the "main" developer meeting, but we could have another one somewhere else. We could have it at FOSDEM, in February.<br />
<br />
Josh brought up the idea of having an unconference day for Postgres contributors. Robert suggested interest group meetings as a refinement of that.<br />
<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:PostgreSQL 9.3]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2013_Developer_Meeting&diff=37564PgCon 2013 Developer Meeting2023-02-10T08:43:05Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers is being planned for Wednesday 22nd May, 2013 near the University of Ottawa, prior to pgCon 2013. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org). <br />
<br />
Please note that this year the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.3 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies, unlike in previous years.<br />
<br />
This is a PostgreSQL Community event. Room and refreshments/food sponsored by EnterpriseDB. Other companies sponsored attendance for their developers.<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be from 8:30AM to 5PM, and will be in the "Red Experience" room at:<br />
<br />
Novotel Ottawa<br />
33 Nicholas Street<br />
Ottawa<br />
Ontario<br />
K1N 9M7<br />
<br />
Food and drink will be provided throughout the day, including breakfast from 8AM.<br />
<br />
[http://maps.google.ca/maps?f=q&source=s_q&hl=en&geocode=&q=novotel+ottawa&aq=&sll=49.891235,-97.15369&sspn=36.237851,79.013672&ie=UTF8&hq=novotel+ottawa&hnear=&ll=45.421528,-75.683699&spn=0.036869,0.077162&z=14&iwloc=A&layer=c&cbll=45.425741,-75.689638&panoid=Z4FUGnkZkdHAOkIxyjjS9Q&cbp=12,25.83,,0,-0.6 View on Google Maps]<br />
<br />
== Attendees ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname):<br />
<br />
* Josh Berkus (secretary)<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Dimitri Fontaine<br />
* Andres Freund<br />
* Stephen Frost<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* KaiGai Kohei<br />
* Alexander Korotkov<br />
* Tom Lane<br />
* Fujii Masao<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Dave Page (chair)<br />
* Simon Riggs<br />
<br />
== Proposed Agenda Items ==<br />
<br />
Please list proposed agenda items here:<br />
<br />
* 9.4 Commitfest schedule and Commitfest tools.<br />
* [http://wiki.postgresql.org/wiki/Parallel_Query_Execution Parallel Query Execution] (Bruce, Noah)<br />
* logical changeset generation review & integration (Andres)<br />
* utilization of upcoming non-volatile RAM device (Kaigai)<br />
* pluggable plan/exec nodes (Kaigai)<br />
** to offload targetlist calculation, sorting, aggregates, ...<br />
* [[GIN generalization]] (Alexander)<br />
* An Extensibility Roadmap (dim) (http://pgsql.tapoueh.org/temp/extensibility.pdf) (15 min)<br />
* Representing severity - derive severity from SQLSTATE (Peter Geoghegan - see http://www.postgresql.org/message-id/CA+TgmoZEjq7va+SfDZQwk6E4emEWThENNyxfqEGhB3iuoT1OJw@mail.gmail.com) (10 min)<br />
* Error logging infrastructure - store normalized statistics about errors in a circular buffer (Peter Geoghegan). Arguably this could be discussed alongside SQLSTATE item. (10 min)<br />
* Failback with backup (Fujii Masao - related discussion is: http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtbJgWrFu513s+Q@mail.gmail.com)<br />
* Volume Management (Stephen Frost - wiki page will be forthcoming before the meeting)<br />
* AXLE Project - Big data analytics for Postgres (Simon Riggs) - an overview of the feature plan, how project works and what community can expect (15 min)<br />
* Incremental maintenance of materialized views (Kevin) - differential REFRESH and infrastructure for ''counting'' algorithm (30 min)<br />
<br />
== Agenda ==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|08:00<br />
|Breakfast<br />
|<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|08:30 - 08:45<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|-<br />
|08:45 - 09:45<br />
|Parallel Query Execution<br />
|Bruce/Noah<br />
<br />
|-<br />
|09:45 - 10:15<br />
|Pluggable plan/exec nodes<br />
|KaiGai<br />
<br />
|-<br />
|10:15 - 10:30<br />
|Volume Management<br />
|Stephen Frost<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 10:45<br />
|Coffee break<br />
|<br />
<br />
|-<br />
|10:45 - 11:00<br />
|Utilization of upcoming non-volatile RAM devices<br />
|KaiGai<br />
<br />
|-<br />
|11:00 - 11:30<br />
|Logical changeset generation review & integration<br />
|Andres<br />
<br />
|-<br />
|11:30 - 11:40<br />
|Representing severity<br />
|Peter G.<br />
<br />
|-<br />
|11:40 - 11:50<br />
|Error logging infrastructure<br />
|Peter G.<br />
<br />
|-<br />
|11:50 - 12:30<br />
|Incremental maintenance of materialized views<br />
|Kevin<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch <br />
|<br />
<br />
|-<br />
|13:30 - 14:15<br />
|GIN generalization<br />
|Alexander<br />
<br />
|-<br />
|14:15 - 14:30<br />
|An Extensibility Roadmap<br />
|Dimitri<br />
<br />
|-<br />
|14:30 - 15:00<br />
|Failback with backup<br />
|Fujii<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:15<br />
|Tea break<br />
|<br />
<br />
|-<br />
|15:15 - 15:45<br />
|9.4 Commitfest schedule and tools<br />
|Josh<br />
<br />
|-<br />
|15:45 - 16:45<br />
|Goals, priorities, and resources for 9.4<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|16:45 - 17:00<br />
|Any other business/group photo<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
| <br />
|}<br />
<br />
== Notes ==<br />
<br />
Attending:<br />
<br />
* Dave Page, EnterpriseDB<br />
* Andres Freund, 2ndQuadrant<br />
* Kevin Grittner, EnterpriseDB<br />
* Dimitri Fontaine, 2ndQuadrant<br />
* Andrew Dunstan, PostgreSQL Experts<br />
* Noah Misch, EnterpriseDB<br />
* Bruce Momjian, EnterpriseDB<br />
* Fujii Masao, NTT Data<br />
* Tom Lane, Salesforce<br />
* Magnus Hagander, Redpill Linpro<br />
* Robert Haas, EnterpriseDB<br />
* Josh Berkus, PostgreSQL Experts<br />
* Kaigai Kohei, NEC<br />
* Jeff Davis, Teradata<br />
* Alexander Korotkov, MEPhI<br />
* Peter Geoghegan, Heroku<br />
* Peter Eisentraut, Meetme<br />
* Stephen Frost<br />
<br />
=== Parallelism ===<br />
<br />
Bruce Momjian is looking at where Postgres is and hardware changes, and it's time to look at parallelism. Unlike the Windows port and pg_upgrade, there's no clear "done" with parallelism. We're going to have to do a lot of small things, but not one big feature. There is concern about code cleanliness and stability. What is going to have to happen is that we'll attack one small thing, and build the infrastructure for parallelism.<br />
<br />
Robert Haas is talking about EnterpriseDB's commitment to parallelism. The two things EDB wants is materialized views and parallel query. The way we're approaching this is the same way as 2Q approached logical replication for the last release cycle. We're doing this as a company, and we have buy-in from our management. So far there's a wiki page on parallel sort and Noah's posted some stuff to pgsql-hackers. The first part is to get a credible worker system in place, and then we can tackle parallelising particular things.<br />
<br />
Stephen Frost pointed out that users are currently ad-hoc implementing parallelism in their middleware code. Bruce said that there was a basic set of steps for all parallel tasks. There's a false sense that threads automatically give you infrastructure for parallelism. Bruce doesn't think that's true. Having the worker/marshalling stuff sprinkled all over the code would be really bad, so we want central infrastructure.<br />
<br />
Jeff Davis pointed out that there were different approaches to parallelism. One is "cluster parallelism". Do we know what approaches we're taking? Cluster parallelism involves dividing the parallel tasks according to data partitions. It's popular in data warehousing. Robert Haas doesn't expect to get that far in one release cycle.<br />
<br />
Haas: People come up with great ideas for PostgreSQL, and they do two things: either they figure out how to do it without modifying the query planner, or they fail. So we looked at index building, which wouldn't require dealing with the query planner. But for the general problem of parallel query planning, we have to solve harder problems. I don't want to get bogged down in those sorts of questions at the outset, because there's a bunch of stuff to get done to execute parallel jobs in general.<br />
<br />
Josh Berkus suggested implementing a framework for parallel function execution because then users could implement parallel code for themselves. It would help the Geo folks. Noah thinks this is possible today, but isn't specific how. Tom argues against exposing it to users in early iterations because the API will change.<br />
<br />
There's a few things you need:<br />
* an efficient way of passing data to the parallel backends, probably using a shared memory facility, because sockets are too slow.<br />
* some logic for starting and stopping worker processes. Custom background workers aren't quite what we need for this. Also different from Autovacuum, which is a bit kludgy.<br />
* you need to be able to do stuff in the worker processes as if they were the parent process. They need to share the parent worker's state, and there are a lot of state things which are not shared. If the master takes new snapshots or acquires extra XIDs, not sure how to share that. Some things will need to be prohibited in parallel mode. Threads don't solve this. Syscache lookups are also a problem, but we need them.<br />
<br />
Noah wants to target parallel sort, specifically parallel memory sort. This hits a lot of the areas we need to tackle to make parallelism work in general. We need a cost model as well. How are we going to mark the functions which are safe to run in a parallel worker? We don't want to just call functions *_parallel because that will change. Maybe there will be an internal column in pg_proc, as a short-term solution.<br />
<br />
Peter E. asked about timeline. For 9.4, we want to at least have an index build which runs a user-specified amount of parallelism. It needs to be reasonably fast.<br />
<br />
Peter G. asked about having a cost model for parallelism. Right now we don't have costing for how long it takes to sort things based on the number of rows. Sorting a text column in a bad collation can be 1000X as expensive as sorting integers, for example. We might pick a single operator and make that the cost reference operator. Perfect costing isn't possible, but we can do some approximations. The initial operations we choose for parallelism will be very long operations. Startup costs are too high otherwise. We're not going to parallelize something that takes 200ms, but rather something that takes 10s or a minute or a couple of minutes.<br />
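<br />
The startup-cost argument can be made concrete with a rough sketch. Everything here is an assumption for illustration: the function names, the n log n sort estimate scaled by a per-type factor relative to a reference operator, the perfectly divisible work, and the one-second worker startup figure.<br />

```python
import math


def estimated_sort_cost(n_rows: int, relative_comparison_cost: float = 1.0) -> float:
    """Abstract sort cost: n*log2(n) comparisons, scaled relative to a
    reference operator (e.g. 1.0 for integers, far more for an expensive collation)."""
    if n_rows < 2:
        return 0.0
    return n_rows * math.log2(n_rows) * relative_comparison_cost


def worth_parallelizing(serial_seconds: float, workers: int,
                        startup_seconds: float = 1.0) -> bool:
    """True if splitting the work evenly across workers, plus a fixed
    startup cost, is expected to beat the serial run time."""
    parallel_seconds = serial_seconds / workers + startup_seconds
    return parallel_seconds < serial_seconds


print(worth_parallelizing(0.2, workers=4))   # False: startup dominates a 200ms task
print(worth_parallelizing(60.0, workers=4))  # True: 15s + 1s startup beats 60s
```

Under these assumptions a one-second startup cost rules out short operations entirely, which matches the decision to begin with operations that run for many seconds.<br />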
<br />
Haas thinks that a lot of people will be appalled at the cost of starting up a parallel worker. That can be optimized later. It's OK for the initial version to be unoptimized. Even if it takes a full second to start up a new backend, there are sorting tasks which take large numbers of seconds. Those are existing issues which we'll hammer on as we get into this space; we may speed up starting a new connection in the process.<br />
<br />
Josh pointed out that if it takes an hour to build an index, it's probably an external sort. Noah posted a patch to allow larger internal sorts, over 1GB. Andrew pointed out that a process model would tie us to certain large operations. Threads would add a lot of overhead to everything, though. We'd have to rewrite palloc. Haas thinks we can get the minimum unit down to something fairly small. Andrew pointed out that on Windows process creation is very expensive. Haas doesn't want to rewrite the entire internal infrastructure.<br />
<br />
With threads, everything is shared by default; with processes, everything is unshared by default. The process model and explicit sharing is a shorter path from where we are currently. Parallelism helps with CPU-bound processes, but not IO. Josh argued with Kevin that there are some types of storage where this isn't true. Kevin just pointed out that if the resource you're using the most of isn't bottlenecked, then it's not helpful to parallelize. Haas pointed out that parallelizing seq scan on a single rotating disk won't help, as opposed to parallelizing scan from memory, which would be much faster. Our cost model isn't up to this; we might even have to have a recheck model where the executor notices things are slow and switches approaches.<br />
<br />
Bruce pointed out how Informix switched to threads between 5 and 7 and it killed the database. Parallelism will take Postgres into new markets.<br />
<br />
Andrew pointed out that prefork backends will help us form new connections if we can get it to work. Haas pointed out that we're going to have to cut nonessential issues to avoid taking forever.<br />
<br />
=== Pluggable plan/exec nodes ===<br />
<br />
Kaigai is working on GPU execution. When he worked on writable FDW, he proposed a pseudo-column approach in which a foreign scan node returns an already computed value, but that was rejected because a scan plan needs to return the data structure as defined. So Kaigai wants to add an API for adding a plan node to the execution tree, allowing the executor to run extension code during query execution. When the plan tree scans a large table with a sequential scan and the target list has a complex calculation, we can have a pseudo-column which does this calculation on a GPU.<br />
<br />
Kaigai is talking about planner and executor. Haas doesn't understand how we would have pluggable planner nodes, as opposed to executor nodes. How would you allow it to generate completely new types of plan nodes? We can replace existing plan nodes, but new types of nodes would require a new extensibility infrastructure. To do this, we need two new infrastructures to inject plan nodes and executor nodes. But what Kaigai is mainly focused on is replacing existing scan and sort nodes. He hasn't yet investigated the difficulty of planner extension.<br />
<br />
Peter E. pointed out that 65% of the work will be the ability to add new nodes at all. Replacement will be MUCH easier. However, the ability to add new nodes would be very useful to PostgreSQL in general. Tom thinks that it could be done. Haas pointed out that we have a lot of weird optimizations about what plan node connects to which other plan node. Tom doesn't think that we have that many. Noah says we'll probably use a hook.<br />
<br />
For a new executor node we have a dispatch table, it's easy. Plan nodes could use a jump table too. Right now we have function volatility markers; for nodes we'll need the same thing. But that's a problem only for expression nodes.<br />
<br />
This was discussed in the cluster meeting. PostgresXC wanted pluggable nodes for cluster scan, as do some other people. So a general pluggability infrastructure would be good. If we have pluggable scan nodes, we can plug in cluster scan as well as GPU scan.<br />
<br />
Jeff Davis pointed out that range join could be our first pluggable node. Haas pointed out that opclass support requirements might make it difficult; there are easier cases. Range join might need to be hardcoded. Pluggable planner stuff is hard.<br />
<br />
This would also maybe get people who fork Postgres to stay closer to the core project and implement extensions instead of having an incompatible fork which then doesn't work with others.<br />
<br />
=== Volume Management ===<br />
<br />
Right now we have tablespaces. Having some more automation around using them would help. Like we want the indexes on a separate tablespace from the heap; there ought to be automation for this. Somebody hacked up something like this ... maybe Depesz, in 2007.<br />
<br />
Haas asked if having indexes on a separate volume was actually faster. Frost asserted that it was. Josh brought up that with new small fast storage there's reasons to want stuff to move around again. Also, index-only scans. If I only have one column, then I can do index-only scans, so I want to put the index on faster storage. Josh pointed out that indexes-separate worked back when at OSDL.<br />
<br />
Stephen Frost pointed out that they have pairs of drives, with a whole lot of pairs. Stephen asked about whether or not we'll ever have things like Oracle Volumes. Kevin said that that configuration works on raw devices, but not so much on complex filesystems. Frost says that for specific workloads, it really works to parallelize everything for massive joins.<br />
<br />
Several people asserted that modern RAID is fairly efficient. Josh asked if any default automated rules would work for a general class.<br />
<br />
Frost explained Oracle Volumes. They can be used to go to raw drives. Volumes are disks or drives or files. You can have multiple volumes under a single tablespace, and Oracle will manage them together. Do we want to do that? Maybe we should just use LVM.<br />
<br />
There's also some other things we could do with volumes, like compressed volumes. Noah has seen tablespaces abused 5 times as much as used properly. We should be careful that what we're adding is really useful. People want things like encrypted & compressed tablespaces. Every time something like this comes up, Tom or someone says "use a filesystem which provides that." There are some advantages to having the database do that, but there's a lot of development effort.<br />
<br />
Noah suggested that event triggers would do this. Frost says that they already have code, they want to simplify it. Josh points out that there aren't simple rules for this; most DWs don't have rules which are as simple as "indexes go here, tables go there". A lot of this is misplaced Oracle knowledge. Josh brought up the backup issue again.<br />
<br />
=== Utilization of upcoming non-volatile RAM devices ===<br />
<br />
Kaigai is just presenting an idea and asking for opinions. He wants to consider the impact of these new devices and to boost the performance of writing the transaction logs. Once we have mmapped to the NVRAM device, then we can consider that completion of log writing. Non-volatile RAM will be exposed as main memory. There are Linux patches which allow mmapping to these devices. Once we have these devices, it dramatically reduces the overhead of writing the transaction log.<br />
<br />
Kaigai said that this topic comes from the forks on RAM storage devices. We wanted improvement of row-writing performance. It is also applicable to general fast SSD storage. He wants to add a fork around row-writing and create extensions where we maybe have NVRAM and an mmapped region for the xlog.<br />
<br />
Josh points out that this dovetails with transactional memory, which allows lock exchange for multiple K of data and multiple instruction. Like we might not need as much from the WAL. Others disputed that.<br />
<br />
Kaigai wants to implement this for the write-ahead log. Then later we can look at the heap. NVRam has limited write cycles, so we don't want to use it for main storage. <br />
<br />
=== Logical changeset generation review & integration ===<br />
<br />
Andres gave a progress report on current status of the changeset generation patch. It's mostly done, but we need to integrate it and it needs substantial review. He needs someone to help, not from 2Q -- someone independent. How can he proceed to get the pieces into core?<br />
<br />
Haas suggested that they could swap for review on parallel query. The patch is currently about 12,000 lines. Peter G says he can take a shot at it, but he wants someone else.<br />
<br />
Frost asked how much additional stuff there is vs. changes to existing behavior. There are new kinds of TOAST tuples. There's mapping relfilenode back to the OID, which is controversial. Andres says that it's safe the way they do it, and if it's broken in logical replication then it's already broken in Hot Standby, so we should fix it. There are some changes to heapam.c, like preventing full page writes from removing the tuple contents.<br />
<br />
Haas points out that Heikki objects to most of the functionality in core. Andres says that he can reduce complexity in some areas. Noah says that requiring zero implications for non-users can't possibly work. If it's a good thing, we're going to have to accept some code complexity cost to do it.<br />
<br />
The write-ahead log currently is designed to do crash recovery. We've pushed that for hot standby and PITR, but this is pushing it even further, we're adding another level of pain. TOAST tuples are a good example of this; how we crash recover them doesn't work for logical replication. To what extent do we use the WAL format we have now, and how much do we modify it? What's the performance cost?<br />
<br />
Some things don't need much intervention. But we have to look up stuff in the catalogs, and do catalog time travel to figure out what the object references mean. Peter E asked about simply creating a completely separate log for logical replication. Andres says that they tried this, but it had a huge extra cost, because you have to add a whole bunch of extra data to it. That would double synchronous writes, at least. So are we not going to investigate this? The logical replication stream would need to include catalog updates.<br />
<br />
Haas points out that the kludges for this aren't as bad as hot standby. Magnus points out that MySQL has to do 2PC between the binlog and the innodb log. But a lot of people are going to want this feature.<br />
<br />
Steve Singer has written 80% of the code for basing Slony on LCR. 2Q is also working on their own replicated solution for this.<br />
<br />
The other modification we need is to deal with xmin for pg_catalog and user tables differently. But this had some benefits for general Postgres efficiency. Can we split this up at all? The earlier patches can be reviewed separately.<br />
<br />
One other thing we devised for this is logical replication slots. You need to know where readers last left off. We could maybe use this for binary replication as well. You can decode on a hot standby, not just on a master. Primary key updates are handled. We only have the TOAST tuples if they change, though. We have before images, but they have some limitations. Kevin asked about rolling backwards to an earlier state.<br />
<br />
Haas will review some of the patch, but not the whole thing. We had the same issue with streaming replication, not enough reviewers. Haas would like to get it done early in the first commitfest. Andres will submit but is not sure we can commit it in the first round.<br />
<br />
=== Representing Severity ===<br />
<br />
Peter G. raised the concern that there's no principled way to discover whether a "severe" error has occurred, such as corruption from errors in xlog.c. There's no way that DBAs can confirm that an error of particular severity has occurred, except grepping for particular strings. There are multiple distinctions here. For example, there are various "can't happen" errors which our developers want to see as soon as possible. Heroku also wants to bring errors to the attention of the lists automatically. The tail-n-mail approach is not scalable.<br />
<br />
Haas's objection is whether it's possible to agree on a categorization of whether or not something is severe. We have hundreds of ereports, so we need a very simple categorization scheme which everyone can agree on. Tom points out that "can't happen" and "severe" aren't necessarily the same thing. Maybe we could use SQLSTATE and tweak some things. Everyone thinks this is the way to go.<br />
<br />
It would be very useful to be able to filter the volume of logging too. We need to log sqlstate though. We might in addition put the severity concept into the log. We'd have to clean up how we use SQLSTATE too. We could supply some default filters, but users would need to tweak them. Filtering the log is a separate feature from fixing the SQLSTATE. That can be done later.<br />
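<br />
Deriving severity from SQLSTATE could look roughly like the sketch below. The two-level "severe"/"routine" labels and the choice of which classes count as severe are assumptions for illustration, not an agreed categorization; the class codes themselves (XX internal error, 58 system error, 53 insufficient resources) are the standard PostgreSQL SQLSTATE classes.<br />

```python
# Hypothetical default filter: SQLSTATE classes treated as severe.
SEVERE_CLASSES = {
    "XX",  # internal error, e.g. XX001 data_corrupted, XX002 index_corrupted
    "58",  # system error, e.g. 58030 io_error
    "53",  # insufficient resources, e.g. 53100 disk_full
}


def severity_for(sqlstate: str) -> str:
    """Classify a five-character SQLSTATE by its two-character class."""
    if len(sqlstate) != 5:
        raise ValueError(f"not a SQLSTATE: {sqlstate!r}")
    return "severe" if sqlstate[:2] in SEVERE_CLASSES else "routine"


print(severity_for("XX001"))  # severe
print(severity_for("23505"))  # routine (unique_violation)
```

Because the mapping is a plain set of class codes, users could adjust the default filter without touching individual error reports, which fits the point that default filters would need tweaking.<br />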
<br />
=== Error Logging Infrastructure ===<br />
<br />
Peter G: As a complementary feature, it would be great to be able to store errors in a circular buffer, like pg_stat_statements. Peter E is doing a presentation related to this tomorrow using logging hooks. You can already do this.<br />
<br />
Peter E argues against doing more aggregated views like pg_stat_statements, because they don't provide historical or granular data. With logging hooks, you can build this more granularly. It already exists in 9.2. The average DBA can't write code in C; logging hooks allow you to work around this, and you can get log data in JSON. Peter E argued against having a hardcoded system for this.<br />
<br />
Josh pointed out the efficiency argument. Robert pointed out the usefulness of "last message repeated 351 times". Peter G envisaged doing pg_stat_errors as a contrib module. This could be built on logging hooks. Dimitri pointed out that you can also do this with FDWs and CSV logs. There was some discussion about how Oracle Enterprise Manager does this. PEM uses CSV logs.<br />
<br />
Discussion about alternative approaches ensued.<br />
<br />
Heroku can feed back some information about errors using log aggregation. Frost would love to have performance information from Heroku.<br />
<br />
=== Incremental Maintenance of Matviews ===<br />
<br />
Kevin: Matching other DBs who have matviews will require a 5 year plan. We have a foundation in 9.3: you can declare a view, and it's materialized. It's just a foundation at this point. For 9.4, we want the first level of incremental maintenance. We can't get all the optimizations in; we just want to get the infrastructure for the simplest cases in there.<br />
<br />
For the first commitfest, I want a transactional REFRESH using the approach on the lists. There will always be cases where incremental maintenance won't work. So we want transactional refresh first, so that we can analyze whether it pays to update using the delta. If every row is touched, that's a loser. So we want to fall back to the full refresh if the delta is too large.<br />
<br />
He won't speculate on what we can get into each subsequent release. Planning to use the counting algorithm, it's very well established. Will only handle the simple cases at first, and will work through complex cases and see how far they get.<br />
<br />
Need a new system column, count_t, which counts the number of ways a given tuple can be derived from source data. It's a kind of reference count. There's a completely different algo for recursive views. Handling NOT EXISTS is a separate implementation too.<br />
<br />
What he wants to get in these first releases is how to implement the count_t column, how to implement other stuff in the first commitfest. Then the other CFs can be refinements. Peter E points out that count could overflow. The new column will be added only for new matviews which are specified as "incremental". We'll probably have other strategies for incremental update, and it'll only be used by the counting strategy. You'd have to do an ALTER table to change that.<br />
<br />
Thinking of doing the incremental change as synchronous in the transaction. Initially will not work for async maintenance. You need a delta with a count which can be plus or minus, you need a snapshot for before and after the change. You need to generate the deltas with each change to the base table, but you don't need to update the matview immediately. <br />
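<br />
The counting idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the matview is modelled as a dictionary from row to its count_t value, a delta is a dictionary from row to a signed change, and the function name apply_delta is hypothetical. It does not reflect how the patch under discussion is implemented.<br />

```python
def apply_delta(matview, delta):
    """Apply signed count changes to a matview modelled as {row: count_t}.

    A row stays in the view while its count is positive and is removed
    when the count drops to zero. A negative result means the delta is
    inconsistent with the view, so it is rejected.
    """
    for row, change in delta.items():
        new_count = matview.get(row, 0) + change
        if new_count < 0:
            raise ValueError(f"count for {row!r} would become negative")
        if new_count == 0:
            matview.pop(row, None)   # no remaining derivations: row disappears
        else:
            matview[row] = new_count
    return matview
```

For example, a row with count 2 survives one deletion from the base tables, because one way of deriving it still exists.<br />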
<br />
There was discussion about how this applies to expensive, infrequently updated materialized views.<br />
<br />
The extra column would only be present in incremental materialized views and the deltas. There would also need to be conditional code in execution nodes which know how to handle the column. Haas questioned if there was some way we could add a non-system column. How do the executor nodes know not to display the system column? It's not like SELinux because it wouldn't be present in every table.<br />
<br />
Maybe we need a general notion of invisible columns. That would be generally useful. Then we could have extra state information. Bucardo could use it. A lot of things want to hide columns from "select *". We should add this concept. <br />
<br />
There was discussion about having stable system views. This change will break things like the ODBC driver, because it gets the list of columns from pg_attribute. We can start with just matviews. But right now drivers assume that anything with a positive attnum is visible. But there's no patch for logical column numbers now. Alvaro has a patch for it, it's complete but it's bitrotted. This is just like attisdropped; it broke a lot of things. Alvaro might update it, he's not sure he'll have time. But we have to get this done in CF1.<br />
<br />
If the next version is 10.0, though, we could break things. So maybe it's a good time.<br />
<br />
=== Lunch Notes ===<br />
<br />
Robert suggested getting rid of the v2 protocol in the next version. He thinks there's hacks in COPY which slow it down and are required for support of v2. But people want to see numbers on that. v2 is also used for JDBC, but that's to fix the plan cache issue which is fixed in 9.2. We need to see concrete improvements, though, and then there'd be an easy switch. We also should know what drivers are using v2, like pygresql.<br />
<br />
=== GIN generalization ===<br />
<br />
Alexander talked about general advances in GIN. Many people use external full text search because it's faster. One of the reasons GIN is slower is that it's used only for filtering, forcing the executor to page all of the results, and ranking. And it doesn't do LIMIT.<br />
<br />
So how the extended indexing solution works: there's positional information, so that we can calculate relevance. So there's now a structure for adding extra information to the vector. We can extract positions of words. To keep the index small we're using run-length compression. The index is then about the same size as the uncompressed index, with a lot of extra information.<br />
<br />
The problem is that we use the infrastructure from KNN. GIN declares that it can do ORDER BY with an operator, so you can get ORDER BY and filtering from the same index. Noah asked what ordering operator he's talking about. GIN calculates distance on item vectors, and then sorts by an array, and then executes gin_get_tuple(), which is new. It can then return them one-by-one for sorting. It's kind of like KNN, but it works differently, but it's much cheaper than getting the tuple from the heap. We need to tell the planner that we can return the original tuple from the index.<br />
<br />
This will be a new operator class. And we modify the operator classes for tsvector.<br />
<br />
Peter E asked for applications outside tsearch. Alexander said that it would also work for similarity searches. Can also be used for regex index. You can build a regex, index it, and then search for text strings which match that regex.<br />
<br />
Andres pointed out an alternate method for doing this, but it requires improvements to the planner. Haas encouraged maybe taking this approach. If we assume that the planner can get an expression from the index, then we need sorting before we go into the heap. That's going to be hard. That's a real issue for any steps which happen between checking the index and the heap. You could see tuples which aren't visible. This might not be a problem for the GIN case, but it's liable to be a problem for the general case.<br />
<br />
Maybe this is related to bitmap indexes. Btree operators need to be able to deal with invisible tuples, so this isn't new.<br />
<br />
Discussion of implementation methods ensued. There was question about what doesn't work in KNN-Gist infrastructure. Like it doesn't work without a WHERE clause. But this could be generalized. This will break amproc, but it's not like anyone makes access methods outside of core. Maybe we should get rid of pg_am and just have a jump table, pg_am just isn't really useful. We can't drop it because the oids there are used as keys elsewhere, and there's really not much cost.<br />
<br />
Another infrastructure question is that currently GIN only has to know one data type. Now when we introduce additional information, we need to know datatypes. SP-GIst needs 3 datatypes. That's why we have spgist_config. For GIN we'll need a new configuration method to return this datatype. So there's a question of breaking older stuff. Can you support both opclasses? Storage parameter would define it, so it would appear in conflict.<br />
<br />
This should probably be GIN2 with new access methods, etc. We'll be changing the storage format of GIN indexes. But that would require supporting the original GIN indefinitely. We could bump the version number on the GIN node page. Two choices: write the code so that it supports the old format, or add a whole new access method. Are we changing the operators or the access methods? Changes are mostly in GIN, a lot less in the operator class. Peter E pointed out that the compression could be a separate feature. Alexander says that the change is not easily separable because of some of the performance optimizations.<br />
<br />
There was more implementation discussion about the methods for doing online upgrade if we change GIN. Rewriting on selects is out. In general we want to deprecate the old GIN, but we'll need a couple versions to do it.<br />
<br />
Frost raised the idea of general infrastructure to support upgrade-in-place. Like we could have a flag in pg_class. But several people dismissed this idea as not really helpful.<br />
<br />
=== An Extensibility Roadmap ===<br />
<br />
See slides linked from this page.<br />
<br />
We've already done some extensibility with create extension and create event trigger. Now I need the extension templates. I want reviewers and committers to understand where I want to go.<br />
<br />
DDL Execution Plans is after that ... like EXPLAIN ALTER TABLE. Dimitri wants CREATE EXTENSION to fetch code from the internet. But core doesn't want that, so Dimitri wants to do that using extension templates and event triggers. Then he wants a PL/C language which compiles on-the-fly. This would allow creating extensions from source dynamically. But the source is in a place only root has access to.<br />
<br />
Dimitri wants extensions to follow replication. There's some security issues with that. There's issue with having such things like downloading in core and having it not suck. There's a lot of reliability issues. Haas isn't keen on the whole idea. But Dimitri doesn't want to code all of the extension features in C. But this is already a problem with external PLs.<br />
<br />
Dimitri wants to be able to build full set of extensions using only a PostgreSQL connection and superuser access. Josh talked about the developer-push issue. But there's security issues with arbitrary C code, so a lot of people have issues with it. Peter G suggested we could somehow sandbox the C code. Haas says that this is possible, but you need kernel-level help.<br />
<br />
Josh says 3 problems: developer push, replication, and backup. But what if we solved it for non-C cases, but not C cases? Some people think it would be useful. Extension Templates for packaged user code would be very useful for in-house code, so it can be packaged as an extension.<br />
<br />
=== Failback with Backup ===<br />
<br />
Fujii hears complaints from users: why do we need to take a full snapshot for failback? If the database is very large, this can take a long time. Three problems: timeline ID mismatch, WAL record mismatch, database inconsistency.<br />
<br />
Timeline ID: we can't start the old master as a replica because of the timeline ID. This is resolved now in 9.3.<br />
<br />
WAL Record Inconsistency: if the master crashes after writing a WAL record but not sending it to the standby, then it has WAL records the standby doesn't have. We can resolve this by removing all WAL records in the old master before starting it up in standby mode. Old master will then try to start recovery, and then start replication to retrieve the WAL records. One problem: last checkpoint record may not be replicated to the standby. Could we make the checkpointer wait for replication? If the standby goes offline, the checkpoint would hang.<br />
<br />
Frost asked how large of a user problem this is? People cited examples.<br />
<br />
Haas suggested a holdback for datafile sync. Magnus pointed out that we could distinguish between failover and switchover, so we could have failback only for switchover circumstances. Jeff pointed out that you have to break replication and resync standbys after upgrade. He also thinks that we can use the WAL to help a diff between datafiles. Controlled switchover would fix the upgrade case too maybe.<br />
<br />
We could create a list of dirty blocks from the WAL. But there are a lot of special cases which would prevent rollback. <br />
<br />
Database inconsistency can happen if the master crashes. There can be some missing database changes. One solution is implementing undo logic, but that's very very difficult. Fujii's approach would be to add some write points to the master. Before the master writes the database to the disk, it writes the WAL to disk. It could wait for replication of WAL before writing to the LSN. Except for hint bits. And what if the replica goes away?<br />
<br />
Fujii wants to solve both the controlled switchover case and the crash & failover case. The controlled switchover case is a lot easier to solve. You could freeze out existing connections, finish writes, and then switch over. There's a lot of similar ideas in Oracle QS.<br />
<br />
Josh suggested that the crash case isn't any different if we're already in synch rep. Others disagreed. Haas suggested that we could use Andres' infrastructure of slots for replication. Noah suggested that we could delay hint bit application until we're after walflushlocation. Maybe we should only solve this for controlled switchover.<br />
<br />
Jeff: for checksums, we already store the first modified record after the checkpoint. We could store very abbreviated information in the WAL for checkpoints. There would be some significant cost for hint bits on a seq scan. If you want to be able to turn the WAL log into an undo log, you only have to copy certain pages. You could log only the first change since the last checkpoint.<br />
<br />
=== 9.4 Commitfest schedule and tools ===<br />
<br />
The CF schedule will be the same as last year. This time do what we said we'd do, not what we actually did. So include the triage periods etc. Josh will run first CF.<br />
<br />
Discussion about patch development during commitfest. Ending a commitfest takes a lot of work. There's a lot of work for moving stuff along. We could make a commitment that -hackers will not argue with the CF manager. Josh also wants to have a commitfest assistant. He called for volunteers. The author ought to be the one to take the patch out, it shouldn't wait on the CFM. Each patch deserves one solid review per CF, but not more than one.<br />
<br />
Kevin points out that the biggest issue is finding reviewers and getting the reviewers to do the review. Josh asked if having reviewers put their name down is actually helpful. Maybe we need to be really aggressive about taking names off. There's a big difference between Noah doing a review and some new reviewer doing a review. Josh suggested clearing reviewer after 5 days. Frost suggested removing them earlier. Maybe more tightly tying them to the archives would be good.<br />
<br />
Lots of discussion ensued.<br />
<br />
We should tell people to not put their names down until they are ready to start reviewing. The 10K line patches are not the problem, the 10 line patches are.<br />
<br />
Josh discussed replacing the CF tool, either with Gerrit or Reviewboard or something new. Frost suggested Github. Josh discussed some issues for reviewers. Discussion ensued, but not recorded because the secretary was speaking. Josh also wants automated patch build. Other people suggested automated apply.<br />
<br />
Josh will develop a spec for a new tool. He will run it past the CF managers for comment. Then post it to -hackers for comment. Haas is skeptical about 3rd-party tools. He hates git stuff. Frost discussed using forks in github. We should be able to track a git branch in the tool.<br />
<br />
=== Goals, priorities, and resources for 9.4 ===<br />
<br />
* Dave hopes to start writing pgAdmin4 this year.<br />
* Kevin will work on materialized views.<br />
* Dimitri: Extension templates and event triggers.<br />
* Andrew: most JSON work is complete, the rest can be extensions. Wants to work on grouping sets.<br />
* Noah working on EDB features (see above), and bug fixes.<br />
* Bruce is working on pgPool to make it better and more usable, maintainable and debugged.<br />
* Fujii is working on PostgreSQL replication. Would also like to work on pg_trigram for Japanese.<br />
* Simon (via Skype) Tablesample patch, COPY tuning, and security reviews (SEPostgres).<br />
* Tom will take an interest in pluggable storage. Maybe also Autonomous transactions.<br />
* Magnus will work on a new CF tool. Also more improvements to pg_basebackup.<br />
* Haas will be working on side issues on matviews and parallel query as they come up.<br />
* Josh will be working on new CF tool. With time/funding will work on automated testing.<br />
* Kaigai will work on row-level security. Also GPU stuff.<br />
* Jeff will be working on convincing Robert to remove PD_ALL_VISIBLE. Also corruption detection. And Range JOIN, or inclusion constraints.<br />
* Alexander will work on new GIN.<br />
* Peter G will be working on error stuff (per above). Also wants to work on fixing archiving falling behind and panic shutdown. Also doing a little bit of work on upsert.<br />
* Andres will be doing logical replication.<br />
* Peter E has the transforms feature outstanding. And wants to move the documentation to XML. Wants to improve test coverage.<br />
* Frost will be doing pg infrastructure. Wants to help with CFs. Also fix some stuff with optimizer.<br />
* Greg Smith is working on autovacuum style I/O limits for other statements.<br />
<br />
=== New Committers ===<br />
<br />
The core team has selected the following new committers:<br />
<br />
* Jeff Davis<br />
* Fujii Masao<br />
* Stephen Frost<br />
* Noah Misch<br />
<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2014_Developer_Meeting&diff=37563PgCon 2014 Developer Meeting2023-02-10T08:43:00Z<p>Alvherre: </p>
<hr />
<div>A meeting of the most active PostgreSQL developers is being planned for Wednesday 21st May, 2014 near the University of Ottawa, prior to pgCon 2014. In order to keep the numbers manageable, this meeting is '''by invitation only'''. Unfortunately it is quite possible that we've overlooked important code developers during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org). <br />
<br />
Please note that this year the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.4 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies, unlike in previous years.<br />
<br />
This is a PostgreSQL Community event. Room and refreshments/food sponsored by EnterpriseDB.<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be from 9:00AM to 5PM, and will be in the "Red Experience" room at:<br />
<br />
Novotel Ottawa<br />
33 Nicholas Street<br />
Ottawa<br />
Ontario<br />
K1N 9M7<br />
<br />
Food and drink will be provided throughout the day, including breakfast from 8:30AM.<br />
<br />
[http://maps.google.ca/maps?f=q&source=s_q&hl=en&geocode=&q=novotel+ottawa&aq=&sll=49.891235,-97.15369&sspn=36.237851,79.013672&ie=UTF8&hq=novotel+ottawa&hnear=&ll=45.421528,-75.683699&spn=0.036869,0.077162&z=14&iwloc=A&layer=c&cbll=45.425741,-75.689638&panoid=Z4FUGnkZkdHAOkIxyjjS9Q&cbp=12,25.83,,0,-0.6 View on Google Maps]<br />
<br />
== Attendees ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname):<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Andres Freund<br />
* Stephen Frost<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Alvaro Herrera<br />
* Alexander Korotkov<br />
* Amit Kapila<br />
* Tom Lane<br />
* Heikki Linnakangas<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Dave Page<br />
* Simon Riggs<br />
* Greg Smith<br />
<br />
== Unable to attend ==<br />
<br />
* Dimitri Fontaine<br />
* KaiGai Kohei<br />
* Fujii Masao<br />
* Craig Ringer; regretfully declined due to date conflict with pressing personal commitments.<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|08:30<br />
|Breakfast<br />
|<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:15<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|-<br />
|09:15 - 09:45<br />
|Planned features for the AXLE project<br />
|Simon Riggs<br />
<br />
|-<br />
|09:45 - 10:15<br />
|Replication features in PostgreSQL core<br />
|Andres Freund<br />
<br />
|-<br />
|10:15 - 10:45<br />
|Commitfest Management app<br />
|Magnus Hagander<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:45 - 11:00<br />
|Coffee break<br />
|<br />
<br />
|-<br />
|11:00 - 11:30<br />
|Pull request model for PostgreSQL<br />
|Peter Geoghegan<br />
<br />
|-<br />
|11:30 - 12:00<br />
|Making the buffer manager scalable<br />
|Peter Geoghegan, Amit Kapila<br />
<br />
|-<br />
|12:00 - 12:30<br />
|Scalability issues in Postgres<br />
|Andres Freund<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:30 - 13:30<br />
|Lunch <br />
|<br />
<br />
|-<br />
|13:30 - 14:00<br />
|Verifying the integrity of indexes<br />
|Peter Geoghegan<br />
<br />
|-<br />
|14:00 - 14:30<br />
|Auditing, changes to logging, etc.<br />
|Stephen Frost, Greg Smith<br />
<br />
|-<br />
|14:30 - 15:00<br />
|Permissions, ROLE attributes vs. GRANT options, etc.<br />
|Stephen Frost, Greg Smith<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:15<br />
|Tea break<br />
|<br />
<br />
|-<br />
|15:15 - 15:45<br />
|Ways to avoid the serious bugs that appeared in 9.3<br />
|Bruce Momjian<br />
<br />
|-<br />
|15:45 - 16:15<br />
|Improving testing and beta testing<br />
|Josh Berkus<br />
<br />
|-<br />
|16:15 - 16:30<br />
|Funding developers for maintenance<br />
|Josh Berkus<br />
<br />
|-<br />
|16:30 - 16:45<br />
|9.5 Schedule<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|16:45 - 17:00<br />
|Any other business/group photo<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
| <br />
|}<br />
<br />
= Meeting Notes =<br />
<br />
== Attendees ==<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Andres Freund<br />
* Stephen Frost<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Alexander Korotkov<br />
* Amit Kapila<br />
* Tom Lane<br />
* Heikki Linnakangas<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Dave Page<br />
* Simon Riggs<br />
* Greg Smith<br />
<br />
Alvaro Herrera was unable to make the meeting due to visa issues.<br />
<br />
== New Committer ==<br />
<br />
The PGDG congratulates Andres Freund on becoming a committer to PostgreSQL.<br />
<br />
== Meeting ==<br />
<br />
=== Planned features for the AXLE project - Simon Riggs === <br />
<br />
"Analytics on eXtremely Large European data". Working with several universities. Allows them to receive funding from the EU; a lot of the projects are around PostgreSQL. Something like $1m in research funding. Projects are required to do PR for AXLE. Contributions are aimed at Core; if they don't go into core, they'll be extensions. Everything will be open source. Simon is under contract to complete the research.<br />
<br />
Simon will be working with the universities and other AXLE participants to get involved in contributing to PostgreSQL. Focusing on real-world scalability for BI and DW; called "operational business intelligence". Looking to handle mid-sized data warehouses. Plans to use most funding for projects which nobody is working on elsewhere to get the most out of the funding, which is why they're not working on parallel query.<br />
<br />
AXLE's interest in security controls has to do with medical data privacy because of some of the early use cases. Peter asked about benchmarks; Simon said that they would be creating a medical data benchmark which will be open, hopefully. He doesn't have control over licensing from external researchers.<br />
<br />
AXLE contributions will be for 9.5 and 9.6. Done some work on security. Other projects are:<br />
<br />
* BDR/Replication<br />
* Security/data privacy/auditing<br />
* Mixed workload controls/resource management<br />
* VLDB management for bigger than 100TB databases:<br />
** maybe new partitioning<br />
** vacuuming large tables - especially vacuum freeze<br />
** Online upgrade via logical replication<br />
* Performance & Tuning<br />
* Minmax indexes<br />
* Event triggers<br />
* Column store/compressed data<br />
* Aggregate push-down<br />
* Bin function for data mining<br />
* Data visualization requirements like approximate queries and sampling<br />
* Hardware enabling, such as custom joins and custom scans, for GPUs and FPGAs<br />
* Materialized view substitution<br />
<br />
Haas mentioned that EDB is also working on resource management.<br />
<br />
Jeff brought up some of the special cases for hash joins mentioned on the mailing list. Simon is trying to figure out a generic API. Josh asked about matview substitution; Simon is looking for a simple case we can build up from. Josh asked about pluggable storage, and there was some discussion about pluggable storage APIs. However, it will be very difficult to deal with different heaps because of all of the assumptions built into other areas of the code. cstore_fdw was discussed as an example.<br />
<br />
The FDW API is a huge success because a lot of people want to use it to access different data. We need to decide if we're going to extend the FDW API or do something else. With cstore, we have an FDW that just creates a local file. But there's no WAL, replication, backup etc.<br />
<br />
=== Replication features in PostgreSQL core - Andres Freund === <br />
<br />
In 9.4 we have the logical decoding API. By itself it doesn't help users very much; it just helps replication authors. So what do we want in core? Andres is working on BDR/async MM. We have most of the stuff now, but want to submit the rest to core. As a first step, we're thinking of integrating basic logical replication into core. External projects are not well-maintained.<br />
<br />
The kind of integration Andres is talking about is event triggers so that we can replicate DDL. Pure external solutions are bad because users don't trust them and some are not maintained. Josh mentioned that the clustering meetings wanted bulk update/delete in core which would also support BDR.<br />
<br />
We need code to be reusable for special-case solutions. We want to enable building external solutions. Simon's users really want stuff included in core; they don't like external projects much. There was discussion about how logical replication is not just one thing.<br />
<br />
Josh gave some examples of components which could be shared between replication systems. He also suggested that the "replication upgrade" and "replicate one table" cases would be simple and common. Gsmith suggested that we need to have support for simple sharding infrastructures in our roadmap or it'll really hurt us. Discussion about the value of in-core vs. external ensued.<br />
<br />
Other things they need for logical replication were discussed.<br />
<br />
=== Commitfest Management app - Magnus Hagander === <br />
<br />
Discussed commitfest app last year. We wanted something new. So the question is what we want to do: do we want it, or is it not worth finishing? Magnus sent out an email on May 5th to review it.<br />
<br />
Peter G. said that it pushes too much stuff to the list. Suggested that we can flag some items as don't post. Further discussion about the app followed. We need to be able to un-reject patches.<br />
<br />
The new app actively pulls in threads from the archives, so that people don't need to paste them in using the message-ids. We'd like to be able to add some kind of metadata at some point though. There was a bunch of discussion of the current features and interface. <br />
<br />
Smith asked about spinning up a copy of the app. The archives API is restricted, but the rest should work locally. There's no developer documentation at this point anyway.<br />
<br />
Magnus isn't sure if he can have it ready for the first commitfest for 9.5. Maybe the 2nd.<br />
<br />
=== Pull request model for PostgreSQL - Peter Geoghegan === <br />
<br />
Current patch model doesn't scale as well as Peter would like. Also, a reviewer can't push changes to other people's in-development patches. The pull model also preserves development history. Andres uses this for his replication stuff.<br />
<br />
The major problem with this is the lack of history in the archives. But if there was an approved workflow, this would make things a lot easier for reviewers. We want to constrain how we rebase things. There was a discussion about rebasing. Andres has a procedure for rebasing.<br />
<br />
We could manage the archiving issue with git.postgresql.org. There are some issues with that. Peter really only wants it for large patches with long development, maybe only major developers. Currently we don't accept pull requests. We'd need to change the developer and submitting a patch docs. For history reasons, it's important that it's in a git repo that the project controls. Simon brought up licensing issues.<br />
<br />
Peter will write up a doc page on the wiki with a procedure for how you would do this. Then we can try it out before we bless it. Haas suggested that we don't need a preferred way to do it, and that one is possible. Josh said that he prefers pulling from git repos to patches for testing. There was more discussion about merge vs. rebase.<br />
<br />
What Peter wants are feature branches, which we could reference by commit hash in the CF app. Folks suggested that maybe we need several defined workloads. We have a challenge in how we integrate large patches. Haas thinks that creates as many problems as it solves though. For logical replication there was a lot of back-and-forth, with small fixes coming about four per hour, and we don't want all that in the history.<br />
<br />
Discussions of various workflows ensued.<br />
<br />
=== Making the buffer manager scalable - Peter Geoghegan, Amit Kapila === <br />
<br />
Peter has done some benchmarking on the buffer manager. This benchmark needs some development, but already shows that we can saturate the buffer manager. He's been able to use this to test better buffer management algorithms. He wants to coordinate his efforts with Amit. His benchmark used entirely unlogged tables.<br />
<br />
Amit described what he is doing to make the buffer manager more scalable. There are two bottlenecks: the buffreelist lock, and the bufmappinglock. The first has been a topic of discussions; a new algorithm can decrease this contention by changing how pages are freed. But that shifts contention to the mapping lock. Amit thinks there shouldn't be much contention at low numbers of buffers, but Haas contends that it's a matter of how many backends you have. Amit wants to make the hash map for the buffer much more concurrent. Also make the bufmappinglocks proportional to the number of clients or the size of the buffer.<br />
<br />
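The idea of spreading one mapping lock across many partitions can be sketched as follows. This is a minimal Python illustration, not PostgreSQL code: the class name <code>PartitionedBufferMap</code>, its methods, and the default of 16 partitions are all hypothetical, chosen only to show that a lookup takes just the lock guarding its own partition.<br />
<br />
```python
import threading


class PartitionedBufferMap:
    """Hash map split into partitions, each guarded by its own lock.

    Hypothetical sketch: names and the partition count are assumptions,
    not taken from the PostgreSQL source.
    """

    def __init__(self, num_partitions=16):
        self._locks = [threading.Lock() for _ in range(num_partitions)]
        self._tables = [{} for _ in range(num_partitions)]

    def _partition(self, key):
        # Keys hashing to different partitions never contend on a lock.
        return hash(key) % len(self._locks)

    def insert(self, key, buffer_id):
        p = self._partition(key)
        with self._locks[p]:
            self._tables[p][key] = buffer_id

    def lookup(self, key):
        p = self._partition(key)
        with self._locks[p]:
            return self._tables[p].get(key)

    def remove(self, key):
        p = self._partition(key)
        with self._locks[p]:
            return self._tables[p].pop(key, None)
```
<br />
Raising <code>num_partitions</code> lowers the chance that two backends need the same lock, which is the kind of scaling with client count discussed above. It does nothing for a single hot key, since that key always maps to one partition.<br />
<br />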
Andres removed read-only lwlock, which removed the contention on the bufmapping locks entirely. Amit will test that. There's contention on the root page even if we increase the number of partitions.<br />
<br />
Peter is trying to make the algorithm better in terms of what to cache and what not to, and Amit is trying to improve freeing of locks. Just increasing the maximum usage count will break high-concurrency workloads. <br />
<br />
Josh brought up the ARC cache, and the changes to the Linux cache. Peter mentioned CAR which is a successor to ARC. Andres disputed that Linux was moving in that direction. Simon would like the ability to track buffers outside of the buffer pool so that we can know if we want to increase the size of the cache.<br />
<br />
Peter and Andres discussed buffer algorithms at some length. We need to look harder at how our clock-sweep works. We can profile this with perf. Greg suggested that the problem is optimizing when to throw out data we don't need anymore.<br />
<br />
Jeff mentioned that pinning shared buffers also needed optimization. Amit says that the major lock he found was lwlocks. Maybe we could have a local recent pinned cache. Peter found that reference period was a much better idea than anything tied to the transaction, based on wall clock time instead of anything else. Checking clock time is expensive, though. A single lookup in the buffer mapping hash is 6000ns, it's way high. The pg_test_timing tool defines this. Greg's only workaround for checking clock time was to have a daemon which cyclically checks clock time in the background.<br />
<br />
The alternative is to count accesses in one operation or how many buffers are accessed at the same time. Haas says moving to a slower system should still evict buffers in the same way.<br />
<br />
=== Scalability issues in Postgres - Andres Freund === <br />
<br />
PostgreSQL doesn't scale as big as we want, because of heavily contended locks. The locks Andres found were different from what he expected; ProcArrayLock wasn't nearly as bad as expected. BufMappingLock was a big problem, though, as discussed. We hold that lock so shortly that it's all lwlock overhead. We want to rejigger the lock allocation, but we need to support atomic ops on the CPU in the future.<br />
<br />
Tom questioned whether atomic ops actually worked because of cache lines. Andres said that Intel has optimizations for this. But a benchmark was 400% faster. How do we deal with the portability issues? Andres's proposal is that we use spinlocks for the atomic ops we don't have, or semaphores on the platforms which do them. We could just desupport all non-atomic-ops platforms, or on the other hand we could require no performance regression for non-atomic-ops platforms. Andres is proposing a compromise.<br />
<br />
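The fallback Andres describes, where a platform without native atomic ops gets them emulated behind a lock, might look roughly like this. It is a hedged Python sketch only: <code>EmulatedAtomicU32</code> and its methods are hypothetical names, and a <code>threading.Lock</code> stands in for a spinlock.<br />
<br />
```python
import threading


class EmulatedAtomicU32:
    """32-bit counter whose atomic ops are emulated with a lock.

    Hypothetical sketch: the lock plays the role of a spinlock on a
    platform that lacks native compare-and-swap.
    """

    MASK = 0xFFFFFFFF

    def __init__(self, value=0):
        self._value = value & self.MASK
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, new):
        """Store new only if the current value equals expected."""
        with self._lock:
            if self._value == expected:
                self._value = new & self.MASK
                return True
            return False

    def fetch_add(self, delta):
        """Atomic increment built from a compare-and-swap retry loop."""
        while True:
            old = self.read()
            if self.compare_exchange(old, old + delta):
                return old
```
<br />
Callers see the same interface whether the operation is native or emulated, so only the emulated platforms pay for the extra lock. Building <code>fetch_add</code> from compare-and-swap is a design choice for this sketch; it shows that one emulated primitive is enough to derive the others.<br />
<br />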
The platforms we're talking about are ARMv6, which is the old ARM, and HP PA-RISC. Those are really old platforms. Some worry that those code paths will get no testing. We're talking about compare-and-swap and atomic increment. How many ops will we require? There are different platforms which support different sets of atomic ops. We don't want too many combinations to support. We need a matrix of what different platforms support and what we want to use. Andres thinks that platforms either support all of them, or test-and-set only.<br />
<br />
Mostly non-atomic or limited platforms are older systems. How much do we care about those?<br />
<br />
The main issue is the need for barriers in atomic ops, which is not supported by spinlocks. Also, are we going to have atomic store and load?<br />
<br />
Andres to put out an email and matrix of supported operations.<br />
<br />
Once Andres removed the lwlock, contention on buffer pins was the next bottleneck. Andres wants to do buffer pins/unpins as atomic ops. He couldn't figure out how to remove the spinlocks for everything, but could remove the lock for pinning without replacement.<br />
<br />
We should put up wiki docs showing testers how they could trace things to see what's being pinned/unpinned so that they can test these changes. We should get those patches in early because it will uncover other bottlenecks.<br />
<br />
Heikki will be working on CSN and replacing how snapshots are formed. There was discussion about who was going to be working on this for what.<br />
<br />
=== Scaling out to Many Nodes ===<br />
<br />
Bruce brought up the fact that PostgreSQL needs to think about scale out in core. Greg said that we can already build dumb non-relational sharding. Others had questions about what we want beyond that.<br />
<br />
Haas discussed stuff from PostgresXC which we might want to bring into mainstream Postgres. Like we might want a GTM in mainstream postgres.<br />
<br />
=== Verifying the integrity of indexes - Peter Geoghegan ===<br />
<br />
Peter submitted a patch called "poor man's normalizing keys", which did text sorting much faster. This is from a research paper. This technique is widely used commercially and is why hash indexes aren't much used anywhere. The way this works is that you use a cheap, broad comparator, and then only compare items in detail if they show up as approximately equal. It's 3X faster.<br />
<br />
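The two-stage comparison described above can be illustrated with a short sketch. This is not Peter's patch: it is a minimal Python example, assuming byte-string keys compared bytewise, and the names <code>ABBREV_LEN</code>, <code>abbreviate</code>, <code>compare</code>, and <code>sort_abbreviated</code> are hypothetical.<br />
<br />
```python
from functools import cmp_to_key

ABBREV_LEN = 4  # hypothetical width of the abbreviated key, in bytes


def abbreviate(key: bytes) -> bytes:
    """Return a cheap, fixed-width stand-in for the full key."""
    return key[:ABBREV_LEN]


def compare(a, b):
    """Compare two (abbreviated, full) pairs, cheapest test first."""
    abbrev_a, full_a = a
    abbrev_b, full_b = b
    if abbrev_a != abbrev_b:
        # The abbreviations already decide the order.
        return -1 if abbrev_a < abbrev_b else 1
    # Abbreviations tie, so fall back to the full, authoritative comparison.
    return (full_a > full_b) - (full_a < full_b)


def sort_abbreviated(keys):
    """Sort byte strings, computing each abbreviation once up front."""
    pairs = [(abbreviate(k), k) for k in keys]
    pairs.sort(key=cmp_to_key(compare))
    return [full for _, full in pairs]
```
<br />
The sketch only shows the control flow; in Python it will not be faster than a plain sort. The scheme is correct only if the abbreviated comparison never disagrees with the full one, which holds here because a bytewise prefix orders the same way as the whole string whenever the prefixes differ.<br />
<br />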
This needs to be generalized to all scans and btree operations, not just a tuplesort. We'd have these broad keys in the btree branch nodes for comparison. But there's an aversion to messing with the btree code because you can seriously break stuff.<br />
<br />
Having some kind of automated way to verify the integrity of indexes is critically important so that we can find index bugs before they happen. Without such a test, we can't do much experimentation with index optimizations. Also we'll need index testing for UPSERT.<br />
<br />
Haas thinks that the poor man's technique will regress for some use cases. Simon suggested some kind of automated function scan. Peter dismissed this because this is meant to be a general optimization which makes all btrees faster. What we're discussing here is to have a way to check that the indexes are valid.<br />
<br />
We have to make this a switch per index anyway to make pg_upgrade work.<br />
<br />
Discussion ensued about the btree indexing patch. Examples were given about cases where it would cause a regression. However, we don't really have data on this; Heikki's worst case was caused by a specific bug.<br />
<br />
We would like both an index integrity check and a heap validity check. People in Stephen's class thought we should have a background worker which checks data integrity.<br />
<br />
=== Auditing, changes to logging, etc. - Stephen Frost ===<br />
<br />
This is about how to change logging to improve it and meet regulatory requirements. Stephen also wants to meet requirements in 800-53 publication. We need to be able to log information about a particular session without logging everything. We don't want to log all statements because we want to turn logging on all the time. Selects are as important as updates.<br />
<br />
pg_audit would be a good start, but just a start. We need to track whether particular tables have been touched, like credit card information. This needs to happen even if the query goes through a view or function. Also Stephen wants auditing in core eventually. Haas argues that it doesn't need to be in core.<br />
<br />
Also, how do you check if a table is flagged for auditing? Stephen wants a reloption here, but that's a tradeoff. Stephen also would like to have syntax, not just functions. What if you want to have different auditing on the replica? The reloption should just be "audit=True". A background worker which injects log messages into a queue would be good too.<br />
<br />
Greg says that this is part of a general class of data that we can get from logs or system views which we want to have persistent information for. Like people want snapshots of stats, or last checkpoint data. Users should have a way to send a message of "please log this for me". Where we log it might be different. Greg likes a pg_stat_statements type model where we can see cumulative data in memory. Some users will require persistence of all messages, though.<br />
<br />
Kevin suggested using replication, which would be possible since you can write miscellaneous messages to the replication stream. Haas pointed out that "in-memory queue table" was another example of a different kind of storage.<br />
<br />
The rest of this can be implemented with event triggers or logging hooks and a background worker. Greg's employer is talking about having him work on 5 different areas:<br />
<br />
# Do we have access to the right data to log? Some things are not really visible to Postgres.<br />
# Filtering for audits at table/user level, whatever. Also look at where filtering is applied.<br />
# Low-overhead auditing for selects.<br />
# Storing audit data somewhere -- options other than text.<br />
# At what point does stuff go into core?<br />
<br />
Greg also remarked that the reason people like stuff in core is because it's guaranteed to be maintained, which stuff outside core is not. But it's a matter of ownership, not of which tarball it's in. Further discussion about in/out of core ensued.<br />
<br />
=== Permissions, ROLE attributes vs. GRANT options, etc. - Stephen Frost ===<br />
<br />
First, we need to reduce the need for superuser rights for so many things. Some discussion of the limitations on DEFAULT PRIVILEGES. Setting a role as "read-only" would also be nice. We would also like the ability of non-superusers to be able to create extensions. Or to limit the ability to SET certain GUCs. There is already something for extensions which Heroku uses. Haas suggested an event trigger to override default access policies.<br />
<br />
Greg brought up his security use case, which basically amounts to eliminating the superuser by assigning all of the superuser's rights to other users. There's roughly 50 places we call is_superuser. We want to figure out ways around this for all 50 and create 30 or 40 individual permissions.<br />
<br />
Haas said there's a more complicated part. What about doing things which are more fine-grained. Stuff which is boolean is easy. But stuff which is not just true or false would require other stuff. Stephen brought up the example of allowing users to use COPY, only from a specific directory.<br />
<br />
Noah suggested just using security definer functions. Dave pointed out that users don't want us loading a whole bunch of functions into the database just for monitoring. But read-only-everything for pg_dump is a good simple case. What if we removed the ROLE requirement for DEFAULT PRIVs? But there are some object types which don't have permissions.<br />
<br />
Josh brought up example of webusers who run webapps as superuser because they ran into issues with permissions. Greg said that we never really examined a lot of these permissions. The Linux kernel went through this for root, and capabilities, etc. And it's still a work in progress. <br />
<br />
=== Ways to avoid the serious bugs that appeared in 9.3 - Bruce Momjian ===<br />
<br />
One of the proposed slogans for 9.4 was "no more multixact bugs". Bruce couldn't figure out whether bugs were new or old or already fixed or something else. And if Bruce can't track it, our users can't track it. But how do we prevent it from happening again?<br />
<br />
Kevin mentioned that the bugs were created by race conditions we couldn't get in a simple test. But some bugs just required a bad vacuum. And some were WAL replay bugs, which didn't show up without full page writes. The patch touched heap_am.c on a low level, and that file is a mess so we didn't anticipate the issues.<br />
<br />
Noah pointed out that as commitfests get later, people push in things they really shouldn't. Simon pointed out that that particular patch had been hanging around for 2 years, and there wasn't really time pressure. The two years were part of the problem because of incompatible concurrent changes. Simon wants to categorize changes and issues into critical/noncritical. Stuff which can break datastructures is really critical.<br />
<br />
Committer trust might have hurt Alvaro's patch. Haas didn't really look at it, nor did Tom because they mostly assumed Alvaro was correct. Like when Tom submits something, Haas doesn't look at it, which can be bad. Noah suggested asking someone else to commit your patch. Simon disagreed that it was an issue of review. The problem is lack of testing.<br />
<br />
Josh pointed out that people found the issues as soon as they upgraded. People postpone upgrading until version .4, so they won't test anything.<br />
<br />
Stephen has talked about building out a performance farm. People suggested releasing in June or July instead, if people are not going to test. People test exciting new features, they don't test under-the-hood stuff. People tested 9.0 because they were waiting for replication.<br />
<br />
This was a major feature of 9.3, it was one of the main reasons to upgrade. We wanted the patch. Andres pointed out that the corruption was invisible most of the time. Maybe we should run the heap_am checker at the end of the regression tests. Stephen wants to have a heap_am background worker.<br />
<br />
=== Improving testing and beta testing - Josh Berkus ===<br />
<br />
See above, plus:<br />
<br />
Josh asked if we really have no interest in getting additional beta testers. People suggested a bug bounty. This was suggested as unrealistic. Haas asked where are our worst potential bugs in 9.4, and mentioned some patches. Maybe we should have some regression tests for them.<br />
<br />
Simon suggested crediting people on the footer of the release notes for bug fixes. As well as reviewers. Andrew suggested that we need ways to get customers involved, not just random folks on their laptops.<br />
<br />
Stephen wants to put together a performance farm, or a high-stress testing environment. We could gather sample data sets and queries to run to get more of a variety of stuff. <br />
<br />
Heikki's WAL testing tool would be helpful if it could be automated. And a heap sanity checker. Noah suggested that when a test infrastructure is in the tree, people use it, like clobber_cache_always.<br />
<br />
Maybe we need a fuzz tester for PostgreSQL like the one Linux had which they spent a year fixing. It's a little harder with PostgreSQL, because they're more flexible.<br />
<br />
We also have no way for Django and other downstream projects to report test suite results. We could run those too, but it would be good to get the results back. And people QA stuff against PostgreSQL versions.<br />
<br />
Kevin suggested running DBT2 for a week on a machine. Josh suggested a performance farm could be used for this together with checkers. We should also look at the coverage of our current regression tests. We should allow adding new tests during beta. Especially additional tests which are run under different suites. Stephen suggested that we're too restrictive about accepting new regression tests. We should also untangle regression tests so that they are more idempotent. Peter suggested throwing out tests which have been passing for two years, or moving them out of the main suite.<br />
<br />
Stephen asked about getting test suites from people, like EDB. But most of the EDB tests are for EDBAS.<br />
<br />
=== Funding developers for maintenance - Josh Berkus ===<br />
<br />
One of the ways which other projects have dealt with not having a reviewer/maintainer time is by paying a few maintainers to work on the project full-time reviewing and maintaining. People asked where we would get the money. Haas pointed out that the Linux foundation pays people quite well. Haas loves the idea, but is dubious about the money.<br />
<br />
People generally approved the idea. Not sure about where the foundation would be. We do have a few people who get to work on Postgres full time for their employers. Also there are issues of governance. We would need to have trustworthy people to manage that person.<br />
<br />
This works for Linux because they have a benevolent overlord. Tom isn't quite that; do we want to give him more power? The Linux Foundation is a good example and could also be an incubator for us. Companies might not want to pay the amounts required.<br />
<br />
This bears investigation, as to whether we could make it work. It could also fund testing.<br />
<br />
=== 9.5 Schedule ===<br />
<br />
Do we want to release 9.4 early? One concern is whether the JSONB format is going to change at all; we want to make sure that it's right. How do we know when we're done with that?<br />
<br />
Simon asked about the release date and the close date of the CommitFests. It would be better if we had more distance between the final and the first release. So for 9.5, maybe we should talk about having different schedule. Robert suggested a year-and-a-half release schedule. Fewer releases would be fewer to support. But people wouldn't like that.<br />
<br />
Having release dates we know about is useful. Getting the release out in June would require terminating CF4 sooner. But there's too much pressure to get everyone's patch in.<br />
<br />
Simon would like to do more in the summer. Right now we just use 6 months out of the CF. We can't move it earlier because of summer, though. We work on stuff in June. Robert suggested that we have enough people on our project to do a release in the summer. Peter says, no, we don't.<br />
<br />
Maybe we should add another commitfest, in February then? Isn't there some overlap with the beta? Not really. We'd need to branch at the same time we do beta.<br />
<br />
We also talked about moving CF4 to Feb 15th instead of Jan 15th. There was debate and discussion about whether or not that was a good thing. Or we could have 5 commitfests.<br />
<br />
Here is the commitfest schedule for this year:<br />
<br />
* June 15<br />
* August 15<br />
* October 15<br />
* December 15<br />
* February 15<br />
* Beta in June for 9.5<br />
<br />
=== Other Business ===<br />
<br />
Robert asked about inviting Jeff Janes and Dean Rasheed.<br />
<br />
Clustering meeting report -- Josh Berkus<br />
<br />
=== End of Meeting ===<br />
<br />
<br />
== Proposed Agenda Items ==<br />
<br />
Please list proposed agenda items here:<br />
<br />
* Planned features for AXLE project (Simon) - details of which features are planned, when<br />
* Commitfest management app (Magnus)<br />
* 9.5 Commitfest schedule<br />
* Making the buffer manager scalable (Peter Geoghegan)<br />
* Scalability issues in Postgres (Andres Freund) (related to Peter's item)<br />
* Ways to avoid the serious bugs that appeared in 9.3 (Bruce)<br />
* Improving testing and beta testing (Josh Berkus) (directly related to Bruce's item)<br />
* Funding developers for maintenance? (Josh Berkus) (directly related to Bruce's item)<br />
* Verifying the integrity of indexes (e.g. structural invariants, that index comports with the heap) with a utility command (Peter Geoghegan)(Directly related to Bruce's item. I Would mostly like to improve the regression tests, to make other new improvements around the B-Tree AM more palatable.)<br />
* Auditing, changes to logging, etc. (Stephen Frost)<br />
* Permissions, ROLE attributes vs. GRANT options, etc. (Stephen Frost)<br />
* Pull Request Model for PostgreSQL Contributions (Geoghegan). There is something to be said for the LKML model, and I believe we should adopt some aspects of that model. Note that I am not proposing that we adopt merge commits, nor that we fundamentally alter our workflow in any other way; I am proposing an approved workflow that some contributors can opt for as an alternative to patch files where that makes sense. Contributors are disinclined to submit code using a workflow that is not officially approved of. I would like to have a process formalized and approved outlining how large patches may be developed in a feature branch of a remote under the control of the author. We can version revisions of the patches using commit hash references, preserving most of the advantages of patch files (perhaps snapshotting with the CF app, so there is a patch file for each version in the archives). With large, complex patches, preserving the history of a patch as it is worked on, and the author's commit message commentary seems quite valuable. Hopefully we can come up with something that weighs everyone's concerns here.<br />
<br />
<br />
<br />
[[Category:PGCon2014]]<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2015_Developer_Meeting&diff=37562PgCon 2015 Developer Meeting2023-02-10T08:42:54Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Tuesday 16 June, 2015 at the University of Ottawa, prior to pgCon 2015. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.5 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Changes from Previous Developer Meetings ==<br />
<br />
Note that the goals for this year's "Developer Meeting" have shifted to account for the Unconference which is being held at pgCon immediately following the Developer meeting and lasting for 1.5 days (Tuesday afternoon and all day Wednesday). This year, the "Developer meeting" will be focused on non-technical issues such as timing/schedule, policies, procedures, and [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems], be they technical or non-technical in nature. The nature of such Wicked problems is that they require a sufficient number of interested individuals to make progress and generally involve both technical and non-technical issues (trade-off decisions, no clear true or false answer, no way to test if a given solution is correct, etc). The Unconference will be focused on technical discussions and design. If you have any questions regarding the nature of the Developer meeting, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the 9.6 release cycle<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 12PM<br />
* Room DMS 3120<br />
* Desmarais Building<br />
* University of Ottawa.<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be after the meeting.<br />
<br />
Note that this meeting is intentionally shorter this year. This is due to the Unconference being held at pgCon.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname):<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Jeff Davis<br />
* Andrew Dunstan (plane delayed)<br />
* Andres Freund<br />
* Stephen Frost<br />
* Masao Fujii<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Álvaro Herrera<br />
* Amit Kapila<br />
* Konstantin Knizhnik<br />
* Alexander Korotkov<br />
* Tom Lane<br />
* Heikki Linnakangas<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Dave Page<br />
* Simon Riggs<br />
* Teodor Sigaev<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|- <br />
|09:10 - 09:40<br />
|9.5 Release Schedule / Restore Reliability<br />
|Bruce Momjian<br />
<br />
|- <br />
|09:40 - 10:00<br />
|Should We Have a Release Team? Who?<br />
|Robert Haas<br />
<br />
|- <br />
|10:00 - 10:30<br />
|Remaining Multixact Cleanup<br />
|Andres Freund<br />
<br />
|- <br />
|10:30 - 10:40<br />
|Promoting Committers: new system?<br />
|Josh Berkus<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:40 - 10:55<br />
|Coffee break<br />
|All<br />
<br />
|-<br />
|10:55 - 11:20<br />
|9.6 Schedule / Quarterly Security Releases<br />
|Stephen Frost<br />
<br />
|- <br />
|11:20 - 11:45<br />
|Getting sponsored more time for review/commit/bugfix <br />
|???<br />
<br />
|- <br />
|11:45 - 12:00<br />
|Any other business<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Finish<br />
|<br />
|}<br />
<br />
= Developer Meeting Notes =<br />
<br />
== Attendees ==<br />
<br />
Dunstan, Davis absent<br />
<br />
No introductions<br />
<br />
== 9.5 Release Schedule / Restore Reliability ==<br />
<br />
Bruce: I was talking to people yesterday and I feel a lot better than I felt 3 months ago. We've gotten complacent about reliability, but now we have stuff which doesn't get fixed with a simple bugfix. We can work on problems instead of staying on the release schedule. Because we haven't had a case like this. Mostly multixacts, but it could have been any failure in our process. We're super-reliable, but we're so used to it that we haven't tried to focus on reliability. It appeared in LWN.net, I think that was neutral. We need to do things differently, and we're doing that.<br />
<br />
Heikki: what could we do differently? Bruce: test harness for the WAL. That's a good example, we can't just wait for user bug reports.<br />
<br />
Heikki: if we develop new features, we should make them testable. Josh: we don't have organized crash recovery testing. Tom: a crash test recovery framework would need to evolve; we couldn't have caught some of the issues with tests we knew about.<br />
<br />
Noah: we choose our balance between reliability and adding new features. We're getting the bug level we can expect. Question is, do we like that balance?<br />
<br />
Kevin: but we fell down on this because people missed posts about issues. Alvaro actually posted about some of the outstanding bugs, but people missed the emails. We should have a more visible TODO list. Haas: we should create a wiki page now. Noah: should also be on the 9.5 Open Items list. Haas: we keep finding other things with multixacts. The list needs to be more visible. Let's start with a wiki page. Andres: sounds like a bugtracker implemented in a wiki page.<br />
<br />
Bruce: we made a joke about the bugtracker. I don't want to address that but I'm going to. We've been blessed in being able to turn bugs around quickly; now we have a bug which we can't fix quickly. Other projects have to deal with this all the time. Haas: most of the stuff we don't deal with nobody who can code cares about. We have 1100 things open against EDBAS, and most of those things are not worth fixing. We need to do feature development. I'm not opposed to a bug tracker, but it will fill up with unimportant stuff. The Multixact thing is different: we need to fix that. You need to curate a bugtracker.<br />
<br />
Amit: new people like to look at bugtrackers as a place to get started contributing. Lowers barrier of entry for contributing. Dave, Haas: people pick things off the TODO list, but that's aged out. Noah: "If there's something I sort of think I should feel guilty about not fixing, I put it in the TODO list." Oleg: we should hire a full-time project manager. Some issues brought up. Discussion ensued.<br />
<br />
What does having a bugtracker have to do with this? Well, it might have allowed us to not lose track of the Multixact issues. How do we close bugs? People need to take no for an answer. Frost: Debian does a good job of this.<br />
<br />
9.5 Schedule: should we put a beta or something out next week? Andres says a new round of fixes will be added in the next 2 weeks. But that doesn't affect beta. Heikki: what are the outstanding items? It actually looks pretty good. Magnus: "It's a little too good looking"; we need to release a beta so people find bugs.<br />
<br />
Haas: what do we think in this release will cause us horrible regrets? Noah: the most risky things affect persistent state. WAL format change, UPSERT. But if there's a bug in the WAL format, then we can fix that during beta. The things we need to fix before beta are UI issues, not bugs. Do we have open questions on RLS? We need to fix those before beta.<br />
<br />
RLS needs a lot more documentation (Simon). Heikki asked for a show of hands as to who knew how RLS worked; less than half the room raised hands. Heikki is worried that too many people have not looked at it. Peter brought up the issue with planquals; folks pointed out that that's not a new issue to 9.5. Frost thinks there are potentially changes to the UI in RLS, especially to RETURNING sets. How do we handle these? There are issues with RLS which need to be resolved before beta.<br />
<br />
Andres says that UPSERT doesn't really affect stability. There are no persistent state changes, so corruption risk is very low. There may be race conditions and such, but it can't even abort a transaction. <br />
<br />
Simon said we should have a guide to beta testing. Vote on alpha/beta/whatever. Heikki pointed out that we need an early beta so that people will actually try features and give feedback. The vote was for putting an alpha out immediately, by more than a 2/3 majority.<br />
<br />
== Should We Have a Release Team ==<br />
<br />
Haas: There seems to be a lot of inertia around creating a release when there's a bug fix. It seems to be very slow before we even talk about doing a release. I was disappointed when nobody was following what was going on with the multixact stuff. I think having a bigger group of people on a closed mailing list would make things happen -- when are we going to do a release and why are we going to do a release. More technical people who understand the bug.<br />
<br />
Dave: ultimately the packagers need to decide when the release happens, since they're doing the work. Kevin: the bigger issue was that if you were following the discussion you'd know that there was a data-eating bug, and the core team missed that. The problem with the process was that the people who know about the severity couldn't see that the core team had missed the problem. Dave: you know that I don't hack on the server. I miss those discussions. Someone can propose that clearly on Hackers. <br />
<br />
Simon: we don't have a mechanism for deciding which things are really severe. Oleg: like on Facebook? Alvaro: we need a way to tag things as really important. Add a tag which says "release". Or we could create a separate list. Simon: that would just have the same issue for whoever is not on the list. Dave: use the packagers list. Magnus: that's not what that list is for. Frost: maybe we should just use the security list, most of the relevant people are already on it.<br />
<br />
Packagers is not the part which is not working. They do fine. The problem is that nobody started the process. You can't tell from the outside if core discussed something or not. There's no reason why the discussion around should we do a release should be confined to 6 people. Last year we went 5 months without doing a release, and you can't tell from the outside what's going on.<br />
<br />
Haas thinks we should have a new release list with Core + active committers + a couple other people. There is a strong overlap with security. Dave brought up primary and secondary packagers. Primary packagers should be on this list. But we need to not disclose security issues. <br />
<br />
Dave proposed to create a pgsql-releases@ mailing list, initially including the committers in the room and Core. Discussion over sharing details of security releases. Need to work out details of who's on it. Need to figure out split between security and releases list. Should the security list be able to decide that we're having a release?<br />
<br />
== Regular Security Releases ==<br />
<br />
(quick schedule rearrangement to continue discussion)<br />
<br />
Heikki thinks we should have a quarterly update release. Dave is concerned that users will then be confused by extra urgent releases. But we still need to do that. It's at least once a quarter, even without serious bugs.<br />
<br />
We don't want to do 3 releases in 4 weeks again. <br />
<br />
Doing an update at least quarterly. "At least one update per quarter", rather than specific dates. Target dates would be nice, but we might not want to make them public, just shared with packagers. <br />
<br />
Noah pointed out that we need a way to know we've made a decision to release or not. Dave thinks we want one person whose responsibility is to make a final decision. More discussion about different systems for doing final determination.<br />
<br />
Haas suggested we should just set up the mailing list, and we'll be able to try and figure out what works and what doesn't. And then we'll discuss what worked next year. Bruce says the value of a release list is that it will add specialists on different areas of the code.<br />
<br />
== Remaining Multixact Cleanup ==<br />
<br />
Andres: there's a number of relatively bad, but hard to hit bugs. Only 2 or 3 people really understand how multixacts work. The big thing is that mxact truncation does not work correctly on standbys and during crash recovery. So we'll need to add a WAL record for truncation, but it happens fairly infrequently.<br />
<br />
We need to make changes and test this. Heikki's harness doesn't test SLRUs. It's a combination failure which is hard to simulate. Tom: if we're saying "we're going to take the technology which works for CLOG and apply it to mxact", that seems pretty straightforward. <br />
<br />
Haas: we should make the wiki page so that everyone understands what's going on. Tom: should we push a fix for this into the alpha? Andres was wondering about this. We should push to all branches, but we should do it before alpha. Tom says that we shouldn't do that. Definitely should be committed to the alpha. <br />
<br />
Kevin: do we have any remaining data loss bugs? Andres: yes. During crash recovery. Alvaro: the problem generally shows up in unusual configurations. For example, if you replay multiple checkpoints. But that happens during PITR.<br />
<br />
We'll put up a wiki page, and then have a more meaningful discussion.<br />
<br />
== Committers ==<br />
<br />
Josh: currently core chooses committers in closed session. Should we have a different system?<br />
<br />
Andres: core should decide more than once a year.<br />
<br />
Haas: people who are involved in the CF process should be discussing who should be nominated as committers. Magnus: Core often polls people, but there isn't an outside discussion. Haas feels that there's people outside core who could make recommendations on people we can trust to commit.<br />
<br />
Dave: the question is, should we always poll the committers? <br />
<br />
Andres: should we have docs-only committers? Core had a recent discussion on this, but didn't want to go ahead. <br />
<br />
Nobody in the room was willing to push forward anyone as a new committer on discussion. Haas is worried that we give short shrift to the Japanese contributors. We tend to lump them together, which isn't fair. "They're not one big Japanese guy." One Japanese contributor was mentioned as potential, but not sure he's ready yet.<br />
<br />
People should definitely propose people to the Core team for committers. Haas brought up the issue that discussions with Core don't get enough input because it's Core and one individual. Why are committers discussions secret? Tom: because we don't want it to be public that we passed over them.<br />
<br />
There was discussion about setting up a closed committers list for discussing nominations. Kind of sounds like the release list. Tom was doing some stats: 13 very active committers, a few who have done commits in the last 6 months, and then 4-5 who haven't committed in years. We need a policy on bouncing inactive committers, and returning inactive committers. Are we worried about old committers coming back and committing stuff? Not really, but having the keys out there is a security risk.<br />
<br />
Haas suggested that only active committers can be included on the closed lists. Dave suggested a policy that anyone who doesn't commit in X months becomes an "inactive committer" and gets taken off mailing lists. Need to figure out numbers; Tom just did numbers in the last couple of weeks. Frost: a few people are active in the community even if they aren't committing.<br />
<br />
Dave proposed some action items:<br />
<br />
* we have a list of the active committers and all of Core. <br />
* use that list for proposing committers moving forward.<br />
* no archives. under security monitoring list<br />
<br />
Use that list for discussion of committers to determine status. Motion passed.<br />
<br />
Dave proposed some rules on retiring inactive committers. 24 months zero commits, they get removed from the active lists. Passed.<br />
<br />
== 9.6 Schedule ==<br />
<br />
We decided to release a 9.5 alpha very soon. Then betas, etc. Haas predicts that if we do an alpha now, we'll do a beta in the fall. And the final won't come out until the end of the year.<br />
<br />
Simon says that there needs to be significant Dev time between final release and feature freeze of next version. Josh suggested releasing a beta right after the alpha, maybe a month later. Magnus says that we should release betas more frequently. Dave suggested doing a beta release every month until final.<br />
<br />
We will do either an alpha or a Beta every month until we're done. Simon wants to target major events like the Europe conference. But many release people are associated with conferences. Josh suggested that we really need regular frequent releases for adoption.<br />
<br />
Magnus said maybe September is a bad target. Maybe we should shift the target to late October. Dave pointed out issues with Diwali getting in the way of packaging and testing. Late October/early November is good timing. Simon points out that letting things slip makes it hard for anyone to get anything done.<br />
<br />
A longer release cycle (18 months, 2 years) would maybe be better? Josh said that it would kill adoption. Haas pointed out that it takes 5 months from closing commits to final release. Maybe someday we can do that faster, but not soon. Discussion about November and October.<br />
<br />
Let's say Mid-October at this meeting as an ambitious target and work our way backwards.<br />
<br />
So, now set the 9.6 schedule.<br />
<br />
Overlapping CFs with beta has been an issue. Should we revisit that? Haas thinks that we have to accept that the early CFs are less productive. So when's the first CF?<br />
<br />
Is our developer bandwidth decreasing? Not so much, but the complexity of code is increasing.<br />
<br />
Discussion of CommitFest dates ensued. Lots of discussion of how to set dates etc. The dates selected were:<br />
<br />
* CF1: July 1 to July 31 2015<br />
* CF2: Sept 1 to Sept 30 2015<br />
* CF3: November 1 to November 30 2015<br />
* CF4: Jan 2 to Jan 31 2016<br />
* CF5: March 1 to March 31 2016, to end on time (sudden death patch rejection)<br />
* Feature Freeze (committer freeze): April 15 <br />
* Beta mid-June<br />
* Release mid-October<br />
<br />
This was followed by discussion of how we arrange the CFs. Noah suggested a prioritization system. He suggested a system where various committers can provide feedback on stuff as triage. Simon agreed and volunteered. Haas suggested a 2-committer veto. Frost said we need a formal way to "nack" something. We need to triage stuff earlier in the process. Haas says the big problem is the endless arguments because we don't want to say "no" to people because we have scarce committer time. <br />
<br />
Heikki suggested that having a serious hacker as the CFM worked, like when he was CFM. Simon suggested using +1 and -1. Haas says that the method isn't as important as coming up with a way to kick out the stuff early.<br />
<br />
== Sponsorship of Reviewers ==<br />
<br />
PostgreSQL Europe is running a cut-rate training thing about how to be a Postgres hacker in order to train people up in hacking Postgres. This is a way to encourage more developers. <br />
<br />
Peter suggested that the main issue is not just employer time. The issue is that it's draining. Especially rejecting patches is draining, so you can't necessarily do so many.<br />
<br />
== Other Business ==<br />
<br />
Holding the developer meeting at some other conference. <br />
<br />
Many people would like to move the meeting around the world. Dave said that it's important to attach it to an established conference. Suggested that New York could work. Europe was good, but it's the wrong time of year. We need to decide way in advance because companies need to sponsor travel.<br />
<br />
For Europe, what about FOSDEM? Some pros and cons.<br />
<br />
People in the room are OK with Ottawa. But what about the folks who aren't in the room? We'll start the discussion again on email with this group. The deadline to make a decision will be July 31.<br />
<br />
== Bug Reports ==<br />
<br />
Simon pointed out that we don't give any credit for bugs which are reported, or bug fixes in beta. He thinks this has led to a decrease in bug reports. Dave suggests that we should credit folks in the commit.<br />
<br />
Magnus pointed out that we need a standard format for crediting people in commit logs, so we can extract names. Someone suggested we use the same format as Linux.<br />
<br />
== Conclusion ==<br />
<br />
Several attendees remarked that this was the most productive Developer Meeting in years. People thought that it was because we removed the technical issues and only discussed project management. Also, WIFI wasn't working. Mostly it was that people had discussed most of the issues on email before the meeting.<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2016_Developer_Meeting&diff=37561PgCon 2016 Developer Meeting2023-02-10T08:42:50Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Tuesday 17 May, 2016 at the University of Ottawa, prior to pgCon 2016. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.6 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
As at last year's event, a Developer/Hacker Unconference will be held on Wednesday for in-depth discussion of technical topics.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the 9.7 release cycle<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 12PM<br />
* DMS 3105 (3rd floor)<br />
* University of Ottawa.<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be after the meeting.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname):<br />
<br />
* Oleg Bartunov<br />
* Josh Berkus<br />
* Joe Conway<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Andres Freund<br />
* Stephen Frost<br />
* Etsuro Fujita<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Heikki Linnakangas<br />
* Amit Kapila<br />
* Alexander Korotkov<br />
* Tom Lane<br />
* Noah Misch<br />
* Dave Page<br />
* Michael Paquier<br />
* Simon Riggs<br />
* Masahiko Sawada<br />
* Teodor Sigaev<br />
<br />
== Agenda Items ==<br />
<br />
Please add suggestions for agenda items here.<br />
<br />
* Minutes. Why is this meeting minuted? Tasks and actions are OK to record, but minutes prevent discussion of certain topics. Discuss stopping taking of minutes, as it used to be.<br />
<br />
* (Major) Contributors. The lists of contributors and major contributors on the web site are not always up to speed with who is currently contributing. I think these lists should be updated (both as to adds and removals) more aggressively. The list of invitees to the developer meeting also tends to stagnate somewhat. Can we come up with a better way to keep this information up to date? [Robert Haas]<br />
<br />
* Core Team. Should core team members continue to hold indefinite tenure and be chosen solely by the existing core team? Is the current membership of the core team the best set of people for its current purposes? It started as a release group, but that function has somewhat been taken over by pgsql-release. [Robert Haas]<br />
<br />
* [[Postgres Professional roadmap]] and interacting with community and other companies. It's no news that large open source projects like PostgreSQL are not driven by enthusiasts alone; a significant part of development and other effort is supplied by commercial companies. PostgresPro is a new company which contributes to PostgreSQL. Although some PostgresPro developers have been community members for many years, they are only starting to contribute on behalf of the company. PostgresPro now has resources for contributing to PostgreSQL and will have more in future. The aim of this topic is to find a way to use these resources most efficiently for both the community and the company. We share our development roadmap. We'd like to find a way to act in a coordinated manner and avoid duplicated effort and conflicts of interest. [Alexander Korotkov]<br />
<br />
* Feedback for PostgreSQL 9.6 Release Management Team [Robert Haas]<br />
<br />
* Review of PostgreSQL 9.6 Release Management Team [Simon Riggs]<br />
<br />
* Schedule for 9.7 [Robert Haas, Simon Riggs]<br />
<br />
* Sharding/Clustering. Should there be a preferred approach? Are certain approaches blocked? [Simon Riggs]<br />
<br />
* Cross-node transactional consistency. Do we want it in core at all? [Simon Riggs]<br />
<br />
* Logical replication [Simon Riggs]<br />
<br />
* Shared testing infrastructure [Amit Kapila]<br />
<br />
* Future Version numbers: we have three credible proposals: (a) keep things as they have been; (b) always increment the first digit and turn the second digit into update number, discarding the third digit; (c) always update the first digit, using the third digit for updates and keeping the middle digit as 0 except for special circumstances. If we choose (a) we have the follow-on question of determining whether the next version will be 9.7 or 10.0.<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|- <br />
|9:10 - 9:25<br />
|Contributors<br />
|Robert Haas<br />
<br />
|- <br />
|9:25 - 9:40<br />
|Core team<br />
|Robert Haas<br />
<br />
|- <br />
|9:40 - 9:55<br />
|Release management team<br />
|Robert Haas/Simon Riggs<br />
<br />
|- <br />
|9:55 - 10:05<br />
|9.7 Schedule<br />
|All<br />
<br />
|- <br />
|10:05 - 10:20<br />
|Shared testing infrastructure<br />
|Amit Kapila<br />
<br />
|- <br />
|10:20 - 10:30<br />
|Postgres Pro Roadmap<br />
|Alexander Korotkov<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 10:45<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|10:45 - 11:00<br />
|Sharding/clustering<br />
|Simon Riggs<br />
<br />
|- <br />
|11:00 - 11:15<br />
|Cross-node transactional consistency<br />
|Simon Riggs<br />
<br />
|- <br />
|11:15 - 11:30<br />
|Logical replication<br />
|Simon Riggs<br />
<br />
|- <br />
|11:30 - 12:00<br />
|Any other business<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Lunch<br />
|<br />
|}<br />
<br />
== Minutes ==<br />
<br />
=== 09:00 - 09:10 Welcome and introductions Dave Page ===<br />
<br />
Attending:<br />
<br />
* Oleg Bartunov, Postgres Professional<br />
* Josh Berkus, Red Hat<br />
* Joe Conway, Crunchy Data<br />
* Jeff Davis, AWS<br />
* Andrew Dunstan<br />
* Peter Eisentraut, 2nd Quadrant<br />
* Andres Freund, Citus Data<br />
* Stephen Frost, Crunchy Data<br />
* Etsuro Fujita, NTT<br />
* Kevin Grittner, EnterpriseDB<br />
* Robert Haas, EnterpriseDB<br />
* Magnus Hagander, Redpill Linpro<br />
* Heikki Linnakangas, Pivotal<br />
* Amit Kapila, EnterpriseDB<br />
* Alexander Korotkov, Postgres Professional<br />
* Tom Lane, Crunchy Data<br />
* Noah Misch<br />
* Dave Page, EnterpriseDB<br />
* Michael Paquier, VMWare<br />
* Simon Riggs, 2nd Quadrant<br />
* Masahiko Sawada, NTT<br />
* Teodor Sigaev, Postgres Professional<br />
<br />
=== 9:10 - 9:25 Contributors Robert Haas ===<br />
<br />
Robert Haas raised the issue that the contributors list isn't promptly updated. The process has been a "spare time" activity for the core team. Simon suggested a process. Josh pointed out that the once-a-year process wasn't working. Stephen suggested a small team. Simon thinks it should be part of the release process.<br />
<br />
Dave points out that the major contributor promotion is an issue. Robert Haas suggested something for the mailing lists. Noah wants some objective criteria. Simon worries about people gaming the system. Heikki said there are two different issues: who's going to do the work, and the list of people. Who should be on the team? Discussion ensued. Robert suggested we don't necessarily want top contributors doing this.<br />
<br />
Volunteers were requested: Josh, Stephen, Joe, Dave. Stephen to follow up on creating a team. There were some questions about an emeritus list. Results of committee will be reviewed by privcommiters. Do other open source projects get this right? Not sure. Josh to ask other projects who gets this right. The team should also decide on rules for promotion/demotion. Some people go away for long periods of time and come back. Michael brought up people who blog a lot.<br />
<br />
=== 9:25 - 9:40 Core team Robert Haas ===<br />
<br />
Robert's concern is that there's a very slow turnover of core team members. However, he thinks that it's a bit too slow. No mechanism he knows of to make sure that the right people get on the team. Who would we select? The roles of the core team have shifted over the years: the core team used to be the release team, but now it's more about project governance. Heikki said that things are OK because the core team doesn't do much. Stephen said that the role makeup of the core team is pretty good, even if we change the people: some committers, some non-committers. The project steering committee is what the core team is today.<br />
<br />
Dave pointed out the role of the core team on the website. Some of those items have now been delegated. Discussion about appointing core team members ensued; we don't have a system. There is a bit of a diversity issue, which was discussed. Andrew suggested that we could consider the value of "outside directors" à la corporate boards. The core team may appoint a wider selection committee than core when we need to replace a member.<br />
<br />
=== 9:40 - 9:55 Release management team Robert Haas/Simon Riggs ===<br />
<br />
This is feedback for the RMT experiment. Josh feels that beta coming out on time trumped everything. Stephen is not entirely convinced that maybe beta wasn't ready. The deadline for beta came out of the Brussels developer meeting. Tom would have liked to see the RMT be more aggressive about rolling back patches. Amit felt that the commitfest management vs. the RMT was confusing, maybe there should be the same role? Robert said the pestering role which David Steele took on was invaluable. Joe remarked how it used to be before the commitfests. Robert would have liked to have a longer feature freeze to beta gap, there wasn't enough time to fix patches; Andres disagreed, said that having a deadline sped things up.<br />
<br />
There were some precipitous commits at the end. We think we know which those were. The RMT can't guarantee the quality of the release, they can just be gatekeepers. Committers have a responsibility to not commit buggy code, that's on them.<br />
<br />
The general feedback on the RMT is positive. We think we want to keep the RMT as part of the process. Amit suggested the RMT get involved in commitfest management. Robert said no; the RMT should be focused on the release.<br />
<br />
Simon suggested that we need a better process for picking the CFM. Simon suggested using the release list. Discuss later.<br />
<br />
=== 9:55 - 10:05 Next Release Schedule All ===<br />
<br />
Stephen wants the feature freeze sooner. Robert says that we don't need a much bigger gap, just two more weeks. The final CF ended with record speed. The last CF could start 15 days earlier. But moving up the schedule would cause patches to be more half-baked. The RMT found time pressure too much though, they were working 7 days a week. Tom suggested that the RMT not include people who have really large patches, but others thought this was impractical. Simon pointed out the importance of rotating people through roles. Noah pointed out that we don't actually know the quality of the release yet.<br />
<br />
Robert would like to keep the beta date, but do feature freeze at Feb 15th. Simon says this is at the expense of developers, who will use the time to stabilize patches themselves. Argument about dates ensued. The question is whether we're targeting the beta release date or the open-the-tree date. Various date schemes were discussed.<br />
<br />
The dates agreed were:<br />
<br />
* CF1: September 1 to 30th<br />
* CF2: November 1 to 30th<br />
* CF3: January 1 to 31st<br />
* CF4: March 1 to 31st<br />
* Feature Freeze: March 31st<br />
* Beta: May 16th<br />
<br />
=== 10:05 - 10:20 Shared testing infrastructure Amit Kapila ===<br />
<br />
Amit started with the "wicked bugs" from 9.5. Right now there is no way to share test machines in order to collaborate on bugs, because the authors can't reproduce the tests. We also don't have any way to publish performance numbers. We need some shared testing infrastructure. Dave brought up the performance machines hosted in Portland. Amit suggested buying some machines. We're not going to buy the expensive machines EDB has, like the 8-socket machine they have.<br />
<br />
Can we see if there are ways that hardware owned by companies can be made available to others? People didn't know about the shared machines. We should edit the wiki page. There's issues of administrative control. Oleg mentioned access which we're given temporarily on donated machines.<br />
<br />
Josh to look for the existing wiki page and send it to Amit.<br />
<br />
=== 10:20 - 10:30 Postgres Pro Roadmap Alexander Korotkov ===<br />
<br />
Alexander talked about the new contributors they're training. He wants to coordinate roadmaps for various companies who are contributing to PostgreSQL. They've added a wiki page with what they're working on. Stephen said that Crunchy would be happy to share theirs, can use the wiki if that works for people. Folks need to understand that roadmap information is provisional, though.<br />
<br />
Alexander stressed the importance of keeping information updated. Simon suggested that we have a collaborative roadmap, with individual's names. It would be good to know what people are working on. A combined roadmap would need to be a consensus document, and individual company roadmaps can be kept up to date better. A consensus roadmap would be inaccurate, and would encourage self-promotion. Andrew suggested having a single combined page. Robert is worried about the downside of having a "roadmap" which forces unrealistic development priorities.<br />
<br />
Discussion ensued. People and companies will post individual roadmaps. Simon will create a unifying wiki page with links to all the others.<br />
<br />
=== 10:45 - 11:00 Sharding/clustering Simon Riggs ===<br />
<br />
Simon asked about a roadmap for sharding. He thinks we need a clear statement for advocacy reasons, people ask about it all the time. We should have an unconference session on sharding, or maybe a wiki page. Simon questioned the FDW strategy; he says no details have been published, so it's impossible to judge the plan. Robert pointed out that no plan survives actual development. Discussion of the FDW sharding ensued.<br />
<br />
Fujita is not focusing on sharding, he is just improving FDWs more. Various people are working on various projects, and somehow those will come together to form sharding and clustering. EnterpriseDB isn't specifically working on FDW Sharding, that's a Bruce project; EDB is working on some specific features, like async execution and aggregate push-down.<br />
<br />
If people have alternate proposals, then propose them. None of the current projects are definitely so good they should block something else. Bruce speaks only for Bruce. Several people agreed that having a bunch of building blocks like distributed transactions would be useful. But Robert thinks that we're too far away from usable sharding to have a spec.<br />
<br />
=== 11:00 - 11:15 Cross-node transactional consistency Simon Riggs ===<br />
<br />
This will be a discussion with the PostgresXL team and 2nd Quadrant. There will be a session at the unconference tomorrow; the Pro team will have a session on multi-master.<br />
<br />
=== 11:15 - 11:30 Logical replication Simon Riggs ===<br />
<br />
Simon mentioned that they need a node registry for logical replication. This is something which will be useful for sharding. It's a roadmap item for 2nd Quadrant. Theirs will have a graph of connections between machines. We need some concrete specifics about this. There's some stuff in PostgresXL. Everybody seemed to think a node registry was a good idea, but nobody had a model.<br />
<br />
pglogical will be submitted as smaller pieces in order to get it into core next year. It was too hard to have it be a contribution and an external working system. Petr has said that the next submission will be different, he'll be forking pglogical. August 1st seems likely for submission.<br />
<br />
=== 11:30 - 12:00 Any other business Dave Page ===<br />
<br />
==== Poll on Version Numbering ====<br />
<br />
Josh called for a straw poll on version numbering. There are four proposals currently on the table:<br />
<br />
1. Keeping things the way they are, in which we have three parts to the version number, a second "major" digit, a third "minor" digit, and a first digit which is about "marketing", or "breakage", and we argue about it. Eight people supported this.<br />
2. Moving to a two-digit number, where the first digit advances every year, and the second digit is for minor releases. Thirteen people supported this proposal.<br />
3. The same as 2, but with three digits, the middle one of which is almost always "0". Nobody supported this proposal.<br />
4. That we shift to a date-based release name. Nobody supported this proposal.<br />
<br />
One developer abstained.<br />
<br />
Dave proposed that we run this by packagers, and ask for feedback on the breakage it would cause. Depending on that, we will move to two-digit numbers. Josh also to put up a blog post looking for breakage information. Peter will drive the follow-up process.<br />
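<br />
To illustrate the kind of breakage packagers might report, here is a minimal sketch, not taken from any real packaging script. The helper names <code>naive_major</code> and <code>major_version</code> are hypothetical, and the sketch assumes that the two-part scheme favoured in the poll takes effect with version 10. A tool that always treats the first two components as the major release would misread an update release of 10 as a new major version.<br />

```python
import re


def naive_major(version: str) -> str:
    """Hypothetical legacy logic: always treat the first two components as the major release."""
    return ".".join(version.split(".")[:2])


def major_version(version: str) -> str:
    """Return the major-release identifier under both numbering schemes.

    Assumption: releases below 10 use three parts (major = first two
    components); releases from 10 onwards use two parts (major = first
    component only).
    """
    parts = [int(p) for p in re.findall(r"\d+", version)]
    if not parts:
        raise ValueError(f"no version number found in {version!r}")
    if parts[0] >= 10:
        return str(parts[0])
    if len(parts) < 2:
        raise ValueError(f"expected at least two components in {version!r}")
    return f"{parts[0]}.{parts[1]}"


if __name__ == "__main__":
    for v in ("9.5.3", "9.6.1", "10.0", "10.1"):
        print(v, "naive:", naive_major(v), "scheme-aware:", major_version(v))
```

Under these assumptions the naive helper reports "10.0" and "10.1" as different major releases, while the scheme-aware helper groups both under "10".<br />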
<br />
Simon called for a vote on whether the next release will be 10.0. There was already consensus that the next release is 10.0.<br />
<br />
==== Select CFM ====<br />
<br />
Amit suggested that similar to the RMT, we should have a designated team for the year. There should be one committer in that group of two or three people. Robert and Stephen pointed out some problems with that; it's hard to have a team, and it's a big time commitment. Heikki suggested that we get four volunteers for CFs next year. We could just solicit on the list, as long as we ask well before the CF starts.<br />
<br />
Core was tasked with making sure that there is a CFM three weeks in advance of each CF.<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgConf.Asia_2016_Developer_Meeting&diff=37560PgConf.Asia 2016 Developer Meeting2023-02-10T08:42:43Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for the morning of Thursday 1st December, 2016 in Tokyo, prior to PGConf.Asia 2016. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.6 release cycle. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
The afternoon will be a Developer Unconference, open to a wider audience.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Review the progress of the 10.0 schedule, and formulate plans to address any issues<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The event will be held on the fifth floor (using American/Japanese style counting) in room 5A at:<br />
<br />
Akihabara Convention Hall<br />
Akihabara Dai Bldg. 4F 1-18-13 Sotokanda, <br />
Chiyoda-ku,<br />
Tokyo 101-0021, <br />
Japan<br />
<br />
Please see the [http://www.akibahall.jp/data/access_eng.html website] for details of how to reach the hall.<br />
<br />
The morning session (9AM - 12PM) will be used for a structured meeting, and the afternoon session (1PM - 5PM) will be used for a 2 track mini unconference for invitees to the morning session and other interested developers.<br />
<br />
[[PGConf.ASIA2016 Developer Unconference]]<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname) and will be attending:<br />
<br />
* Joe Conway<br />
* Etsuro Fujita<br />
* Magnus Hagander<br />
* Kyotaro Horiguchi<br />
* Kohei KaiGai<br />
* Bruce Momjian<br />
* Dave Page<br />
* Michael Paquier<br />
* Simon Riggs<br />
* Masahiko Sawada<br />
* Teodor Sigaev<br />
* Tomas Vondra<br />
* Amit Kapila<br />
* Takayuki Tsunakawa<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave<br />
<br />
|- <br />
|9:10 - 9:20<br />
|10.0 Release Schedule (ref: https://wiki.postgresql.org/wiki/PgCon_2016_Developer_Meeting#9:55_-_10:05_.09Next_Release_Schedule_.09All)<br />
|All<br />
<br />
|- <br />
|9:20 - 9:35<br />
|Status report of the first two commit fests already done for PG10 development cycle.<br />
|Michael<br />
<br />
|- <br />
|9:35 - 9:50<br />
|Reviewing unreviewed patches<br />
|Simon<br />
<br />
|- <br />
|9:50 - 10:00<br />
|Revoking Committer access for inactive committers (ref: https://wiki.postgresql.org/wiki/PgCon_2015_Developer_Meeting#Committers)<br />
|Simon<br />
<br />
|- <br />
|10:00 - 10:15<br />
|Providing information for applications which support PostgreSQL<br />
|KaiGai/MauMau<br />
<br />
|- <br />
|10:15 - 10:30<br />
|Revisit varlena for larger data support (>1GB)<br />
|KaiGai<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 10:45<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|10:45 - 11:05<br />
|Transaction control for statement-level rollback and stored procedures<br />
|MauMau<br />
<br />
|- <br />
|11:05 - 11:25<br />
|Multi-model database - beyond relational<br />
|MauMau<br />
<br />
|- <br />
|11:25 - 11:45<br />
|The future of built-in Postgres sharding<br />
|Bruce<br />
<br />
|- <br />
|11:45 - 11:55<br />
|Any other business<br />
|Dave<br />
<br />
|- <br />
|11:55 - 12:00<br />
|Group photo<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Lunch<br />
|<br />
|}<br />
<br />
==Agenda Items==<br />
<br />
Please list any agenda items below for inclusion on the schedule.<br />
* 10.0 Release Schedule (all) - see https://wiki.postgresql.org/wiki/PgCon_2016_Developer_Meeting#9:55_-_10:05_.09Next_Release_Schedule_.09All<br />
* Status report of the first two commit fests already done for PG10 development cycle. (Michael)<br />
* Providing information for applications which support PostgreSQL (KaiGai/MauMau)<br />
* The future of built-in Postgres sharding (Bruce)<br />
* Revisit varlena for larger data support (>1GB) (KaiGai)<br />
* Multi-model database - beyond relational (MauMau)<br />
* Transaction control for statement-level rollback and stored procedures (MauMau)<br />
* Reviewing unreviewed patches (Simon)<br />
* Revoking Committer access for inactive committers (not looking for a binding decision since not everybody is present) (Simon)<br />
<br />
==Minutes==<br />
<br />
'''Introductions:'''<br />
<br />
Anti-clockwise, starting with Dave: Dave, Magnus, Joe, Masahiko, Michael, Amit, KaiGai, Takayuki (MauMau), Simon, Tomas, Kyotaro, Etsuro <br />
<br />
<br />
'''10.0 Release Schedule:'''<br />
<br />
Dave: [Review of timeline discussed in Ottawa]<br />
<br />
Michael: We appear to be on track. Schedule looks to be in good shape.&lt;br /&gt;
<br />
Amit: We need to look at the individual big patches - are we on time with them?<br />
<br />
Magnus: Do we want to push the release if we want some of the outstanding features?<br />
<br />
Dave: If we want to push the release for a specific feature, there will be strong push back.<br />
<br />
General opinion seems to be that we should not push the release - but if we want specific patches, we need to knuckle down.<br />
<br />
Bruce: Are there any specific patches? Multivariate stats<br />
<br />
Magnus: pg_logical<br />
<br />
Bruce: Is there a sense that stuff is languishing in the commitfests?<br />
<br />
Simon: MV stats is in that position.<br />
<br />
Michael: MV stats has been pushed 3 or 4 times now.<br />
<br />
''All agree schedule looks good, there should (as usual) be no pushing of the release, and that we should select a new RMT at/following the FOSDEM developer meeting.''<br />
<br />
<br />
'''Status report of the first two commit fests already done for PG10 development cycle:'''<br />
<br />
Michael: 1st commitfest was the largest ever, with 220 patches! Lots of new features submitted. At end of commitfest, noted that there were ~70 patches awaiting author. The 2nd commitfest finishes today. There are a lot of patches awaiting review, not so many waiting for author, and a couple waiting for committer. For the last couple of months, people have begun to use the CF app for tracking bug fix patches (which is good), but these are not receiving much attention. 12 patches are bug fixes ATM, only 5 are ready for committer.<br />
<br />
There are a lot of small patches which people seem to like reviewing, whilst bigger patches get less attention. For example, the cmake patch, or WAL logging for hash indexes which is a 130K patch, or MV stats which is 500K(!)<br />
<br />
Amit: Covering indexes is getting bounced from CF to CF. <br />
<br />
Magnus: That one is getting returned with feedback though - it's going through the process.<br />
<br />
Dave: Is it the norm that patches are getting bounced?<br />
<br />
Amit/Michael: Yes, for the large complex ones.<br />
<br />
Dave: So do commitfests still work?<br />
<br />
Joe: The tracking is very important<br />
<br />
Michael: Yes, the concept is very good. We can't stop people being uninterested in the reviewing, but it's great when it does work.<br />
<br />
Bruce: Some of this is just really hard, grungy work - and people don't always want to do it.<br />
<br />
<br />
'''Segue to: Reviewing unreviewed patches:'''&lt;br /&gt;
<br />
Michael: Before the segue, we should consider large patches that won't make it - I suggest that cmake is in that position.&lt;br /&gt;
<br />
Joe: We should have a way on the CF app to show if people actually want patches.<br />
<br />
[side discussion on Cmake]<br />
<br />
Magnus: One of the reasons for the FOSDEM meeting is that we can use that for final kicking out of patches.<br />
<br />
Simon: This does follow on. I would welcome a list of things that we all want to go into the release. That's why I wrote PITR - it was at the top of the TODO list. [To Amit]: I'm happy to review your WAL/Hash indexes patch, but it hasn't changed since September and now has failures so I can't review it. We should have a CI system for ongoing testing of large patches. We should track large changes in GIT rather than using patches.<br />
<br />
Magnus: Very few patches have git info on them in the CF app<br />
<br />
Dave: We had this discussion at lunch yesterday - essentially, a superset of committers and trusted folks could mark patches as non-malicious, at which point they can be added to a branch for daily CI builds.<br />
<br />
Simon: I just want to git-pull to get the latest version of your work, as some of these patches are huge. If we allow the patch author and committers to update a public branch, then others could submit patches to that - patch on patch.<br />
What I would like is to nominate important/large patches and put them on a postgresql.org controlled git repo, which would allow them to be re-checked daily to avoid bit-rot. The author would be emailed if the patch fails, and would have access to that repo to correct bit-rot. It is important that the first patch is submitted to the list, for all the normal reasons. After that we should be using the best modern tech to work collaboratively on important patches: git.&lt;br /&gt;
<br />
Dave: This is complex - we're not going to fix it here.<br />
<br />
Joe: Let's discuss in the unconference.<br />
<br />
<br />
'''Revoking Committer access for inactive committers:'''<br />
<br />
Magnus: We have rules on this, passed at the 2015 meeting: "Dave proposed some rules on retiring inactive committers. 24 months zero commits, they get removed from the active lists. Passed."<br />
<br />
Simon: I raise this here, as it's the first dev meeting in Japan and rules would affect a Japanese developer. I propose to implement the previously discussed rule which would affect Itagaki Takahiro - and be clear that it is a general rule and nothing personal.<br />
<br />
Dave: Noted - this does need to be proposed through the private committers list.<br />
<br />
*'''ACTION ITEM''': Simon to email priv-committers.<br />
<br />
<br />
'''Providing information for applications which support PostgreSQL:'''<br />
<br />
KaiGai: Please see the wiki page for "Ecosystem:Business Intelligence (BI)". This is useful info for people who want to choose a database system. I think we need a good central point of information about what systems work with PostgreSQL.<br />
<br />
Dave: We have the Product Catalogue on the website. Why not use that?<br />
<br />
KaiGai: I think it makes sense to maintain the central point of information, not just for applications, but also for use cases/references.<br />
<br />
MauMau: Customers often ask if Postgres works with different software.<br />
<br />
Dave: So if I understand correctly, the issues are application compatibility for which we have the product catalogue, and users/case studies for which we have a page on the website that is largely unmaintained.<br />
<br />
Joe: It can be hard to get permission to release info like case studies from large companies.<br />
<br />
KaiGai: Simon, do 2ndQuadrant have case studies?<br />
<br />
Simon: My marketing people would be ecstatic if we put our case studies on postgresql.org<br />
<br />
Tomas: We could just have a form that companies can update their info on<br />
<br />
Dave: We have that, just without specific links for case studies. <br />
<br />
Magnus: We have a political decision - who should be able to publish? What about Amazon RDS or Aurora? Where do you draw the line?<br />
<br />
Simon: If it's on postgresql.org, we should copy the content to avoid abuse. Secondly - to be listed as a platinum sponsor, you must be able to provide 2 case studies (for example).<br />
<br />
<br />
'''Revisit varlena for larger data support (>1GB):'''<br />
<br />
KaiGai: Postpone to unconference, to get us back on time!<br />
<br />
<br />
'''Transaction control for statement-level rollback and stored procedures:'''<br />
<br />
MauMau: The motivation is to migrate users from other databases. We failed to acquire a particular customer following a benchmark - their legacy application used a legacy Cobol precompiler. We chose to use the Microfocus Cobol compiler. The application is essentially a batch application that issues many statements in a single transaction. The customer couldn't meet the performance requirements; one reason was that they needed statement level rollback. Both pgJDBC and ODBC emulate this using savepoints - but that has a high round-trip time. We want statement level rollback without using savepoints. What I want to ask here is, is it easy to implement statement level rollback? &lt;br /&gt;
<br />
Amit: What is statement level rollback?<br />
<br />
Simon: He wants to use subtransactions around each statement. Use BeginInternalSubTransaction to do this, not Savepoints.<br />
<br />
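For illustration, the per-statement savepoint emulation mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not pgJDBC or ODBC driver code: the helper name run_with_statement_rollback and the table t are hypothetical, and Python's built-in sqlite3 module is used only so that the example runs without a server. In PostgreSQL an error inside a transaction aborts the whole transaction unless a savepoint is rolled back to; SQLite already undoes only the failing statement by default, so the savepoints here show the pattern rather than being strictly required.&lt;br /&gt;

```python
import sqlite3


def run_with_statement_rollback(conn, statements):
    """Run all statements in ONE transaction (hypothetical helper).

    Each statement is wrapped in its own savepoint, so a failing
    statement is undone alone and the transaction carries on.
    Returns the indexes of the statements that failed.
    """
    failed = []
    conn.execute("BEGIN")
    for index, sql in enumerate(statements):
        # Extra command before every statement: this is the overhead
        # of the emulation when each command is a network round trip.
        conn.execute("SAVEPOINT stmt")
        try:
            conn.execute(sql)
        except sqlite3.Error:
            # Undo only the work of the failing statement.
            conn.execute("ROLLBACK TO SAVEPOINT stmt")
            failed.append(index)
        # Extra command after every statement, success or failure.
        conn.execute("RELEASE SAVEPOINT stmt")
    conn.execute("COMMIT")
    return failed


# isolation_level=None puts the module in autocommit mode, so the
# BEGIN/COMMIT issued above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

failed = run_with_statement_rollback(conn, [
    "INSERT INTO t VALUES (1)",
    "INSERT INTO t VALUES (1)",  # duplicate key: this statement fails
    "INSERT INTO t VALUES (2)",  # still runs in the same transaction
])
rows = [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
```

Every statement costs an additional SAVEPOINT and RELEASE, which, if sent as separate commands over a network, adds round trips. That is the overhead MauMau describes, and the reason for asking about a server-side alternative.&lt;br /&gt;
&lt;br /&gt;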
MauMau: Is it difficult to implement this in PostgreSQL?<br />
<br />
Simon: I think it would be reasonable to have a parameter to allow the user to specify transaction behaviour.<br />
<br />
Michael: Can you use Craig Ringer's libpq batch handling work?<br />
<br />
Amit: It's better if you can propose a draft patch.<br />
<br />
Simon: I think we can deal with this server-side with some changes in tcop. I'm writing some notes about it now for later discussion.<br />
<br />
MauMau: Next is stored procedure support. Various people have tried to implement stored procedures<br />
<br />
Dave: You're looking for different transactional control, rather than the calling convention which is the only difference in the SQL standard from stored functions?&lt;br /&gt;
<br />
Joe: Right - the standard just differentiates on the calling convention, not the transactional side.<br />
<br />
Simon: It doesn't really matter what they're called - the point is that users should have the ability to deal with transactions. It shouldn't be a major problem - we should be able to read the definition in one snapshot, then use more snapshots as needed whilst running the code.&lt;br /&gt;
<br />
''Everyone seems to agree this is useful, but no one is working on this right now.''&lt;br /&gt;
<br />
Joe: I don't think there's 100% agreement on what a stored procedure in Postgres actually is.&lt;br /&gt;
<br />
<br />
'''Multi-model database - beyond relational:'''<br />
<br />
MauMau: We are seeing more use of non-relational technologies these days. Is PostgreSQL going into other areas in order to become more popular? At the moment the DB rankings show Postgres at number 4, but I'm worried. Should we do more in document storage, KV, graph etc? Does the Postgres community want to enter into the wider database war?<br />
<br />
Bruce: I think our community in general has done a poor job of addressing multi-modal, but we're so good at accepting ideas that we succeed. We've had people in the community who are visionary, and we've been good at allowing them to expand. We've been successful, but we never really planned this.&lt;br /&gt;
<br />
Tomas: We never get a use case for a multi-modal database. We're not opposed to it though.&lt;br /&gt;
<br />
KaiGai: People who have relational skills are more in demand. I think the key is the extensibility of PostgreSQL.&lt;br /&gt;
<br />
Magnus: I think the issue is mostly having a good API<br />
<br />
MauMau: I heard that in the past PostgreSQL made a decision not to leave the relational world.&lt;br /&gt;
<br />
Bruce: I think you're thinking transactional control<br />
<br />
Simon: Postgres is "beyond relational" so you're already there!<br />
<br />
Tomas: The docs still call us an ORDBMS<br />
<br />
<br />
'''The future of built-in Postgres sharding:'''<br />
<br />
Bruce: Because of various people I've talked to, I got on the bandwagon a couple of years ago that we should include built in sharding. I'm very happy with the progress we've made. A lot of work has happened in Japan, and I wanted to ask if anyone has any comments or feedback.<br />
<br />
Simon: I see a lot of work happening, but I don't know why. If someone can explain why, then I might be able to back it. If I knew what you were doing, then I might help.&lt;br /&gt;
<br />
Etsuro: I don't have an idea of the overall picture of sharding, but my understanding is basically based on FDWs. For now, we are focussing on building such building blocks, and after improving such features, we can discuss the future of the overall architecture.<br />
<br />
Simon: If we have something to talk about now, I'd like to talk about it for 10.0. I'd like to discuss this this afternoon.<br />
<br />
Bruce: NTT maintained XC for 10 years for little gain - most of that problem was the forked code. NOW, we should be able to get the benefits with sharding.<br />
<br />
Joe: Is there a wiki page for this, listing all the building blocks and their status?<br />
<br />
Bruce: I have a file on my laptop, as I've experienced so much negativity.<br />
<br />
Simon: We need a clear statement on what people are trying to do, even if they're trying to do it in different ways. I want a detailed discussion on what we're trying to do.<br />
<br />
Dave: Is there anything on the wiki at all Bruce?<br />
<br />
Bruce: No<br />
<br />
Dave: How about creating a wiki page with your vision on it, listing the goals and how the building blocks will fit together to meet those goals. Put a disclaimer on it so people don't think it's official or an EDB roadmap.<br />
<br />
Tomas: I can't discuss it with you sensibly without a proposal. I'm concerned that we'll damage foreign data wrappers in a bid to make them more suitable for sharding.<br />
<br />
*'''ACTION ITEM:''' Bruce to document sharding ideas. [https://wiki.postgresql.org/wiki/Built-in_Sharding ''Wiki page created'']<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2017_Developer_Meeting&diff=37559PgCon 2017 Developer Meeting2023-02-10T08:42:38Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Tuesday 23 May, 2017 at the University of Ottawa, prior to pgCon 2017. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 10.0/9.6 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
As at last year's event, an Unconference will be held on Wednesday for in-depth discussion of technical topics.&lt;br /&gt;
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the 11.0 release cycle<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 12PM<br />
* DMS 3105<br />
* University of Ottawa.<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be after the meeting.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname):<br />
<br />
* Aleksander Alekseev<br />
* Oleg Bartunov<br />
* Joe Conway<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Andres Freund<br />
* Stephen Frost<br />
* Etsuro Fujita<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Álvaro Herrera<br />
* Kyotaro Horiguchi<br />
* Amit Kapila<br />
* Haribabu Kommi<br />
* Alexander Korotkov<br />
* Tom Lane<br />
* Amit Langote<br />
* Heikki Linnakangas<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Thomas Munro<br />
* Dave Page<br />
* Michael Paquier<br />
* Masahiko Sawada<br />
* Teodor Sigaev<br />
<br />
Apologies<br />
* Simon Riggs<br />
<br />
== Agenda Items ==<br />
<br />
* Upgrading PostgreSQL without a downtime. (Aleksander Alekseev)<br />
* Commit fest management (Michael)<br />
* Intellectual Property Issues (Andres)<br />
* Recognizing Contributors (Stephen)<br />
* Release notes scope, and giving credit (Peter E.)<br />
* Web site design (Peter E.)<br />
* 64-bit xids (Alexander Korotkov)<br />
* Adopting an indent fork (Álvaro Herrera)<br />
* ''Please add suggestions for agenda items here. (with your name)''<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|-<br />
|09:10 - 09:20<br />
|PG 11 release and commitfest schedule<br />
|Dave Page<br />
<br />
|-<br />
|09:20 - 09:40<br />
|Commit fest management<br />
|Michael<br />
<br />
|-<br />
|09:40 - 10:00<br />
|Upgrading PostgreSQL without a downtime<br />
|Aleksander Alekseev<br />
<br />
|-<br />
|10:00 - 10:05<br />
|Intellectual property issues<br />
|Andres<br />
<br />
|-<br />
|10:15 - 10:30<br />
|Web site design<br />
|Peter Eisentraut<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 10:45<br />
|Coffee break<br />
|All<br />
<br />
|-<br />
|10:45 - 11:00<br />
|Release notes scope, and giving credit<br />
|Peter Eisentraut<br />
<br />
|-<br />
|11:00 - 11:20<br />
|Recognising contributors<br />
|Stephen<br />
<br />
|-<br />
|11:20 - 11:40<br />
|64-bit xids<br />
|Alexander Korotkov<br />
<br />
|-<br />
|11:40 - 11:50<br />
|Adopting an indent fork<br />
|Álvaro Herrera<br />
<br />
|- <br />
|11:50 - 12:00<br />
|Any other business<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Lunch<br />
|<br />
<br />
|}<br />
<br />
== Minutes ==<br />
<br />
=== Welcome and introductions ===<br />
<br />
Attendees:<br />
<br />
* Aleksander Alekseev<br />
* Joe Conway<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Peter Eisentraut<br />
* Andres Freund<br />
* Stephen Frost<br />
* Etsuro Fujita<br />
* Peter Geoghegan<br />
* Kevin Grittner<br />
* Robert Haas<br />
* Magnus Hagander<br />
* Álvaro Herrera<br />
* Kyotaro Horiguchi<br />
* Amit Kapila<br />
* Haribabu Kommi<br />
* Alexander Korotkov<br />
* Tom Lane<br />
* Amit Langote<br />
* Heikki Linnakangas<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Thomas Munro<br />
* Dave Page<br />
* Michael Paquier<br />
* Masahiko Sawada<br />
* Teodor Sigaev<br />
<br />
=== PG 11 release and commitfest schedule ===<br />
<br />
Haas proposed using the '''same schedule as last year''', only shifting the<br />
dates forward by one year. Measure passed unanimously.<br />
<br />
Timing of the release branch is not yet clear.<br />
<br />
=== Commit fest management ===<br />
<br />
(Paquier) I seek feedback on recent CF management. At the end of each CF, I<br />
classify each patch's status and send a short email to the patch's thread. Is<br />
that spam? (Haas) No, it's good; some authors read their threads only. Don't<br />
decide the fate of a patch on an additional thread that doesn't include the<br />
author. CF manager does need to push other CF participants.<br />
<br />
(Freund) Don't always move patches to the next CF; use Returned with Feedback<br />
instead. Patches have traversed 4+ CFs without changing. (Haas) If something<br />
is or should be Waiting on Author at CF end, use Returned with Feedback. When<br />
authors have submitted an updated patch at the end of a CF, Paquier has moved it<br />
to the next CF. That is a good move. (Lane) I worry about patch authors not<br />
familiar with the CF process, who may not know to resubmit after Returned with<br />
Feedback. I propose moving the patch at the end of one CF, then closing it if<br />
the next CF starts with no state change.<br />
<br />
(Freund) Returned with Feedback breaks up the history, because the next<br />
submission will create a new entry. (Eisentraut) That is a tooling problem.<br />
<br />
(Paquier) CFs used to have 100 patches max, but they now see 250. Process<br />
struggles to keep up. (Haas) If we ever kept up with patch flow, we don't now.<br />
<br />
(Grittner) Patch repeatedly moving from one CF to the next can mean nobody was<br />
interested, or nobody competent had the time. (Freund) We're bad at giving<br />
early feedback that a change is not wanted. (Lane) For every patch, at least<br />
one person (the author) found it interesting. (Geoghegan, Lane) We don't<br />
outright reject very much. (Kapila) Each case is controversial. (Misch) Most<br />
almost-rejections are due to implementation, which we can't foresee early.<br />
<br />
(Eisentraut) When we move a patch to the next CF, the reviewer may not be aware.<br />
Should we clear the reviewer slot on move? (Haas) Reviewer list in CF app is<br />
often very wrong. (Paquier) Eliminate CF app tracking of reviewer? (Kapila)<br />
Reviewer field is helpful as a target for CF manager nagging. (Haas) Don't<br />
remove it, but don't trust it. (Lane) Should the CF manager aggressively remove<br />
an inactive reviewer? (Haas) I support clearing the field at CF end.<br />
(Grittner) When removing a reviewer, notify that person by email. (Linnakangas)<br />
The open items process, with Misch emails, has worked perfectly. (Eisentraut) I<br />
don't care about nag emails. Patches get stalled ~6 weeks because the listed<br />
reviewer is idle; other reviewers focus on entries with blank reviewer.<br />
<br />
(Paquier) The wiki's CF checklist is very outdated. I propose to rewrite it.<br />
Will add useful CF app links to that page or to the patch submission page.<br />
After doing those updates, I will request feedback from -hackers.<br />
<br />
=== Upgrading PostgreSQL without a downtime ===<br />
<br />
(Haas) As a technical topic, this is out of scope for this meeting. Save it for<br />
the unconference. (Page) Okay to discuss technical decisions such as openness<br />
to on-disk format changes. Technical discussion did work well in Brussels.<br />
<br />
(Korotkov, Alekseev) The community is conservative. For example, it uses Perl.<br />
We plan to allow v10->v11 upgrades but not direct v10->v12 upgrades. Is that<br />
okay? (Misch) I don't see a problem with this one-version upgrade window.<br />
(Hagander) Agreed; it's okay to need five upgrades to cross five versions.<br />
<br />
(Haas) Biggest thing that won't work is system catalog changes. (Geoghegan)<br />
Logical replication solves many of the problems.<br />
<br />
(Page) With SQL Server, you can just upgrade the binaries and restart.<br />
<br />
(Alekseev) I'm defining "zero downtime" as "indistinguishable from failover to a<br />
standby". (Grittner) Not proposing to preserve TCP connections? (Alekseev)<br />
Correct, not proposing that. (Geoghegan) Suggest starting with minor release<br />
updates. (Haas, Grittner) Under this definition of zero downtime, that's<br />
already possible.<br />
<br />
(Haas) I posted a pie-in-the-sky design for major releases. Lane found flaws.<br />
<br />
(Linnakangas) This will differ from the multiple-page-versions approaches, which<br />
still assumed use of pg_upgrade.<br />
<br />
(Page) Is it acceptable to require replication and a second server?<br />
<br />
(Herrera) Will every catalog change require code to support this? (Alekseev)<br />
No. For example, we just won't drop old catalog columns. (Davis) Challenging<br />
thanks to our use of C structures and offsets to access catalogs.<br />
<br />
(Haas) It will never work to perform physical replication across major
versions. Could have v11 binaries recognize and convert a v10 cluster, but<br />
cannot have a v10 replica with a v11 master. (Freund, Momjian) Yes; the<br />
protocol is compatible, but the payload is not. (Linnakangas) It's possible but<br />
far more complicated and hard to test.<br />
<br />
(Misch) Propose conclusion that the group has no objections to this proceeding.<br />
We've provided various challenges to cover in the design.<br />
<br />
(Lane) Possible alternative is to spend time making pg_upgrade faster. (Freund)<br />
ANALYZE statistics are the hard part. (Haas, Eisentraut) pg_upgrade fails for<br />
hard-to-predict reasons.<br />
<br />
=== Intellectual property issues ===<br />
<br />
The group voted to exclude this topic from the minutes.<br />
<br />
=== Web site design ===<br />
<br />
(Eisentraut) Site is stuck in the past. We have occasional pushes to hire<br />
someone or otherwise fix it. (Frost) We did a bunch of work that didn't get<br />
merged. Showed it off at pgconf.eu 1.5 years ago. Sarah Conway of Crunchy Data<br />
is leading the design effort. Someone needs to reimplement that on a different<br />
Django version. (Momjian, Haas) This lacks project management. Who, if anyone,<br />
will do the remaining work? (Frost) I am targeting completion this fall.<br />
(Eisentraut) If I email pgsql-www every two weeks, is that annoying enough?<br />
<br />
(Misch) What's actually wrong with the current site? (Eisentraut) Looks bad on<br />
small devices. (Geoghegan) It creates a bad impression for new users.<br />
<br />
(Page) Incremental changes can be hard; changing the font size could lead to<br />
months of bikeshedding. (Hagander) Partial migration is undesirable, because it<br />
would leave the site with a blend of different designs.<br />
<br />
(Hagander) Updating the wiki is more urgent. (Eisentraut) Is there a backlog of<br />
infrastructure issues? (Hagander) Just the wiki and the mailing lists.<br />
<br />
=== Release notes scope, and giving credit ===<br />
<br />
(Eisentraut) Some items attributed to me warrant credit to important reviewers.<br />
<br />
(Linnakangas, Eisentraut) How should we treat performance improvements in the<br />
release notes? (Momjian) I can easily recognize significant feature additions,<br />
but it's harder to tell the significance of a performance improvement. I want<br />
to include changes that are actionable, that allow new uses of PostgreSQL. From<br />
1200 commits I wrote 200 release note entries, some of those associated with<br />
more than one commit. Adding commit hashes as SGML comments has been a big win.<br />
I expect changes to initial notes; for v10, the editing thread was shorter than<br />
usual. (Freund) I was dissatisfied with the original treatment of performance<br />
patches and the general statement about their value. "Will it run faster?" is<br />
the #1 thing for my users. However, I'm content with the release notes as they<br />
stand today. (Haas) Committers have an action item to write commit messages<br />
that explain the change's significance. (Momjian) Commit messages were really<br />
good. (Haas) Hard to write notes for Lane improvements that address rare query<br />
shapes. (Linnakangas) Should consciously include an end-user-visible example.<br />
<br />
(Grittner) Most releases improve two kinds of performance. First, they improve<br />
concurrency: they move the performance "knee" to the right (more clients) and<br />
make its decline shallower. Second, they make certain workloads suffer less<br />
bloat.<br />
<br />
(Conway, Haas) Including mailing list links in commit messages has been useful.<br />
<br />
(Haas) Do we keep including author names in release notes? (Freund) Linking to<br />
the commit would suffice for credit. (Eisentraut) Should we move all names to<br />
the end? (Momjian) Names at the end is okay if we do link to commits. (Haas)<br />
Linking to commits would be useful. (Momjian) Use javascript to show/hide all<br />
commit references, and/or show them on mouseover. (Dunstan) PDF docs won't<br />
handle that as well. (Eisentraut) Implementing dynamic behavior in every doc<br />
format is a lot of work.<br />
<br />
(Momjian) v10 has 10-20% fewer release note items than recent releases, but it<br />
has big items.  (Freund) That will change next release.&lt;br /&gt;
<br />
(Momjian) For 9.2 and 9.3, Josh Berkus and others gave feedback that listing<br />
reviewers was too much. (Eisentraut) Nonetheless, we should list them in the<br />
release notes. (Misch) I realize it's not the community consensus, but I remain<br />
supportive of listing reviewers at the bottom of the release notes. (Dunstan,<br />
Haas) We wouldn't mind that. (Dunstan, Misch) We don't credit one-line<br />
reviewers in our commits, so they'd be appropriately excluded from release<br />
notes.  (Linnakangas) We could even credit everyone who sends a mailing list
post. (Geoghegan) I favor inclusion: not one-line reviewers, but almost<br />
anything else. People don't read notes end-to-end, anyway. (Eisentraut)<br />
Propose we<br />
'''collect all names mentioned in commit messages and stick them at the end of the v10 release notes'''.<br />
I will do the work. Measure passed unanimously.<br />
<br />
=== Recognising contributors ===<br />
<br />
(Frost) We have competing proposals for updating the web site contributor list.<br />
Need to define criteria. Could even just use the list Eisentraut makes for the<br />
release notes. (Haas) Don't want to add that many. When I proposed a list, I<br />
had forgotten that last year's meeting had commissioned another effort.<br />
Committed CF entries dominated the Frost proposal, and that's a problem.<br />
(Frost) I looked at email size as a way to recognize non-code contributions.<br />
(Haas) We do want to recognize non-code contributions.<br />
<br />
(Freund) We're letting perfect be the enemy of good. (Page) We should vote<br />
yea/nay on a list presented so far, and move on. (Haas) Analyzing 2016 commits<br />
took ~8hr. Fine to automate if we can, but automation is inessential. I'm<br />
willing to spend 24hr analyzing satellite projects. Frost has blocked each of<br />
my proposals. (Page) Your time is better spent on code. (Haas) I thought that<br />
for five years, too. (Page) Propose taking Conway's list, discussing anything<br />
needed, and adopting it.<br />
<br />
(Page, Haas, Geoghegan) Numbers should be a starting point; they should not<br />
dictate the final answer. (Haas) Best to remove CF data from the pile.<br />
<br />
Poll: ~40% of this room is not on the contributors list.<br />
<br />
(Geoghegan) We'd be unanimous on recognizing 10-20 people.<br />
<br />
(Linnakangas) Who will physically update the site? (Conway, Lane) Suggest<br />
converting Haas list to a web site diff and sending it to<br />
pgsql-private-committers@ for approval. (Haas) Expect more cycles of update.<br />
For example, I only classified people as past contributors when 100% confident.<br />
Complaints are inevitable.<br />
<br />
(Haas) Propose having Frost and myself sit down after this meeting and agree on<br />
something. (Momjian) I want the site updated by Thursday. (group conclusion)<br />
'''Haas and Frost may proceed to update the list''', letting this happen<br />
quickly.<br />
<br />
=== 64-bit xids ===<br />
<br />
(Korotkov) Postgres Pro has a patch for this. Some customers have problems not<br />
fixed by 9.6 freeze work. No access to production systems for seeing details.<br />
<br />
(Misch) Community would need to see the good-case and worst-case benchmarks.<br />
<br />
(Grittner) This won't fix vacuuming to reclaim space.<br />
<br />
(Lane) Eliminating freezes is attractive.<br />
<br />
(Geoghegan) What happened to the Linnakangas version of this? (Linnakangas)<br />
It's unfinished.<br />
<br />
(Haas) We know there's more to improve here, but we don't know what exactly<br />
remains.<br />
<br />
(Haas) I will vote against widening xmin/xmax on disk. The Linnakangas approach<br />
didn't have that problem. (Linnakangas) Don't want on-disk format changes.<br />
<br />
(Linnakangas) Do problems remain despite free space map and freeze map?<br />
(Freund, Geoghegan) Yes. (Freund) I see anti-wraparound vacuum problems in 9.6<br />
three times per week, in response to configuration changes. Spending 1.8B XIDs<br />
in two days is not the worst I've seen. (Geoghegan, Freund) Nobody cares about<br />
clog size. (Haas) People do care, but not the same people who care about<br />
anti-wraparound vacuum.<br />
<br />
=== Adopting an indent fork ===<br />
<br />
(Lane) Considering adoption of Piotr Stefaniak indent fork. I studied the diff<br />
it causes and liked nearly everything. Should we embed it in the PostgreSQL<br />
tree, or should we give it a standalone repository? In-tree lets us correlate<br />
changes, and out-of-tree lets us use the latest version on all branches. (Haas)<br />
Weird to cram a standalone tool into the main tree. (Misch) Similar to TAP<br />
support modules. (Lane) Similar to much of src/tools.<br />
<br />
(Freund) Great if every contributor can easily run indent. (Eisentraut)<br />
Developers will figure it out easily, relative to other challenges of starting<br />
development like installing dependencies and configuring one's editor. (Freund)<br />
Mainstream OSs package all dependencies, so that part is easy.<br />
<br />
(Haas) We should reindent every week or every day. By end of cycle, the tree<br />
has accrued lots of divergence. It's hard to commit indent-clean code to a file<br />
that is already not indent-clean. "make indent" would be great.<br />
<br />
(Eisentraut) If we switch indent implementations, I want one with upstream<br />
maintenance. (Freund) Author of proposed replacement is a FreeBSD committer and<br />
plans to submit indent code to FreeBSD. (Misch) We won't pull from upstream<br />
automatically; syncs will be more like timezone code syncs. (Freund) The<br />
replacement is already more-maintained than pgindent. (Lane) I vote not to wait<br />
for better maintenance. (Haas) If Lane will cover any problems, let it happen.<br />
Proposed tool does improve things.<br />
<br />
(Eisentraut) Happy leaving v10 as-is (pgindent).<br />
<br />
(Lane) Distinct issues (1) whether to switch indent tools and (2) how to<br />
distribute indent tool code. (Misch) Also (3) how often to reindent.<br />
<br />
=== Any other business ===<br />
<br />
(Misch) Congested agenda today. Should we meet longer next year? (Haas)<br />
Schedule was based on an unconference starting after the meeting, which we no<br />
longer do. But we wouldn't survive a 09:00-17:00 meeting. (Hagander) Perhaps<br />
09:00-14:00?<br />
<br />
(Page) How should the other developer meetings compare to this one? Brussels<br />
has been heavy on technical topics. (Haas) Unconference is better for technical<br />
topics. We used to have lots of narrow-interest technical topics that boiled<br />
down to the presenter speaking at a non-participating group.<br />
<br />
(Linnakangas) When linking to mailing list posts in commit messages, I use a<br />
postgresql.org URL, but most use postgr.es. (Eisentraut) Should we link to the<br />
whole thread or to a single message?<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2018_Developer_Meeting&diff=37558PgCon 2018 Developer Meeting2023-02-10T08:42:32Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Tuesday 29 May, 2018 at the University of Ottawa, prior to pgCon 2018. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 11/10 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
As at last year's event, an Unconference will be held on Wednesday for in-depth discussion of technical topics.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the 12.0 release cycle<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 12PM<br />
* DMS 3105<br />
* University of Ottawa.<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be after the meeting.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname). Note that we can accommodate a '''maximum of 30'''!<br />
<br />
# Ashutosh Bapat<br />
# Joe Conway<br />
# Jeff Davis<br />
# Andrew Dunstan<br />
# Peter Eisentraut<br />
# Andres Freund<br />
# Stephen Frost<br />
# Etsuro Fujita<br />
# Robert Haas<br />
# Magnus Hagander<br />
# Kyotaro Horiguchi<br />
# Tatsuo Ishii<br />
# Amit Kapila<br />
# Jonathan Katz<br />
# Tom Lane<br />
# Amit Langote<br />
# Heikki Linnakangas<br />
# Noah Misch<br />
# Bruce Momjian<br />
# Thomas Munro<br />
# Dave Page<br />
# Michael Paquier<br />
# Masahiko Sawada<br />
# Teodor Sigaev<br />
# David Steele<br />
# Tomas Vondra<br />
# Gregory Stark<br />
# Takayuki Tsunakawa<br />
<br />
Apologies<br />
* Simon Riggs<br />
<br />
== Agenda Items ==<br />
<br />
* 12.0 release and commitfest schedule (Dave)<br />
* Heavier filtering for patches in last CF (Andres Freund)<br />
* How can CFs/development process put more focus on long pending patches (Andres Freund, Jonathan Katz)<br />
<br />
* ''Please add suggestions for agenda items here. (with your name)''<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:30<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|- <br />
|09:30 - 09:45<br />
|12.0 release and commitfest schedule<br />
|Dave Page<br />
<br />
|- <br />
|09:45 - 10:10<br />
|Heavier filtering for patches in last CF<br />
|Andres Freund<br />
<br />
|- <br />
|10:10 - 10:30<br />
|Keeping track of features / context of features<br />
|Jonathan Katz<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|11:00 - 11:30<br />
|How can CFs/development process put more focus on long pending patches<br />
|Andres Freund<br />
<br />
|- <br />
|11:30 - 11:50<br />
|GDPR and how it affects us<br />
|Magnus Hagander<br />
<br />
|- <br />
|11:50 - 12:00<br />
|Any other business<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Lunch<br />
|<br />
<br />
|}<br />
<br />
Note: This timetable is a rough guide only. Items will start as soon as the previous discussion is complete (breaks will not move however). Any remaining time before lunch may be used for Commitfest item triage or other activities.<br />
<br />
== Photo ==<br />
<br />
[[File:Pgcon dev team meeting 2018.jpeg|800px]]<br />
<br />
== Minutes ==<br />
<br />
Dave P. kicks off the developer meeting. Team is trying to decide the "version number" for which # meeting this is.<br />
<br />
Everyone present gives introductions. Occurs in a record four minutes.<br />
<br />
=== 12.0 Release Schedule ===<br />
<br />
Dave P. starts discussing the schedule. Dave proposes that we release around September, 2019.<br />
<br />
No one makes a motion to change (Except Stephen, who proposes September, 2018).<br />
<br />
Dave P. asks about commitfest schedule.<br />
<br />
Robert observes that everyone tries to jam everything into the last commitfest. The reason they do that is because they know when the last commitfest will be. Robert proposes that we generate a random number: if it is odd, we release. If it is even, we do not. [Laughter]. Would like to solve the problem of people cramming everything in at the last minute.<br />
<br />
Tom: We have a full slot for this topic, maybe we should switch the order.<br />
<br />
Dave: It's the next one.<br />
<br />
[Laughter]<br />
<br />
Andres: Was there anything really bad schedule wise?<br />
<br />
Tom: A lot of "gross" stuff committed in the last few days of the final commitfest.<br />
<br />
Robert: For a year or two, we used to freeze earlier (Jan/Feb). Was then moved to March, the later freeze has been offset by "fear" of the release management team coming to get you. It has worked well doing it in March.<br />
<br />
Peter E: When we froze earlier, there was a distraction of people working on the release.<br />
<br />
Jonathan: Is there a way to change the behavior to encourage people to propose patches / commit earlier?<br />
<br />
Heikki: Fundamental problem of "discipline"<br />
<br />
Noah: Robert's idea not unfounded.<br />
<br />
Magnus: Variable releases would help developers, but would not help users.<br />
<br />
Greg S: In the last month, there is a variable of the commit.<br />
<br />
Andres: The problem is much more we had a lot of patches in the commitfest that were updated in the middle of the commitfest. Just need to be more "cruel" about patches.<br />
<br />
Robert: Who has time for cruelty? There are so many patches on the mailing list.<br />
<br />
Andres: There were some patches that we did not thoroughly look at and some people were also unaware of the rules of the commitfest.<br />
<br />
Noah: Even if we address the patches that weren't close to ready, would it address it?<br />
<br />
Andres: I think it would help.<br />
<br />
Dave: Who is actually going to do this? Who is going to be "cruel?"<br />
<br />
Andres: Yes, if more people help with that it would be less "Andres is in a bad mood." Need to be more obvious around the rules.<br />
<br />
Robert: Yes, need to write down the expectations so everyone can see what they are. Does not mean everyone will automatically meet the expectations. Gives an analogy to parenting.<br />
<br />
Stephen: We're going to end up using vocabulary in there that is subjective, but better than nothing at all.<br />
<br />
Andres: I posted a proposal that needs a bit of polish.<br />
<br />
Amit: RMT should prioritize what could make it and what won't<br />
<br />
Stephen: RMT has been post - could possibly move it up.<br />
<br />
Greg S: Had a rule that patches in the final commitfest should be "close-to-ready." Only patches that the reviewers said were "close to ready" were approved for the final commitfest.<br />
<br />
Andres: Not sure if that technical process is the problem. The author said "I don't care, I want it to be there anyway"<br />
<br />
Peter E: If we are going to be more restrictive on these things and you can't submit stuff in the last commitfest, then we're really shrinking the window of when we should actually be doing stuff. It impacts the work we can do, then for new contributors "Oh you weren't here in November you can't contribute." I don't mind if someone submits a reasonably sized patch in March and it is good, we can work on it together.<br />
<br />
Tom: Agreed, but size of the patch has a lot to do with it.<br />
<br />
Peter E: Judgment calls: this patch is too big, too late, etc. It is human work, but we need to be inclusive.<br />
<br />
Robert: We went back to this system; then there is not as much opportunity to do as much patch review. You commit stuff and you get to the end of the release cycle, and guess what there are bugs, and you have to fix it. Last year, I was working flat out for months committing partitioning bugs. Still had to review patches. I see the value of this, I see it's quite painful. The more you restrict the window, the more things will be at the window.<br />
<br />
Andres: You want more time for development, but it creates too much work?<br />
<br />
Heikki: In the period where we are working the release [Jonathan checked an email about PostgreSQL 8.2 windows installer...]; perhaps should squeeze the window.<br />
<br />
Tom: Does not seem like we are very productive in the summer.<br />
<br />
Andres: Should we move the release?<br />
<br />
Tom: September is a great month of release. Can we do something more productive with the summer?<br />
<br />
Andres: If we are waiting to release in Sept because we are unproductive in the summer, do we get any gain?<br />
<br />
Stephen: Then do we move the last commitfest to the summer?<br />
<br />
Robert: Last commitfest is when Magnus is on his boat.<br />
<br />
Noah: If we get little done in August, we should move the last commitfest to August.<br />
<br />
Jeff: Should we do a semi-major release with some changes out?<br />
<br />
Peter E: Cannot do a whole lot without catalog changes.<br />
<br />
Andres: Depends on the area you work in.<br />
<br />
Stephen: Don't want to go back to having 3 digits in release number.<br />
<br />
Jonathan: Let's not muck up release process. User standpoint rant in terms of how people plan their upgrades.<br />
<br />
Tom: Work harder documenting expectations. Changing topics: Start a new branch for the commitfest in July.<br />
<br />
Stephen: What's the difference releasing the prior branch in September vs. whenever we branch?<br />
<br />
Tom: Yeah once we branch is there any more testing getting done?<br />
<br />
Robert: Well perhaps we try to release sooner; years ago we tried to release in June. But then people said you couldn't release in July & August. The RMT process is good at preventing what used to take a long time, i.e. open items that stay open forever. Someone can make you revert your patch. Perhaps we just decide "Hey, dot-O is coming out on June 15."<br />
<br />
Peter E: The open item list is quite short right now - we probably could release in a month. We need people to test their application and we need the period for it.<br />
<br />
Tom: The reason it's so short is because nobody has tested it yet.<br />
<br />
Robert: Not going to get much testing between June - August.<br />
<br />
Andres + Noah: Steady stream of bugs that come in until RC 1.<br />
<br />
Stephen: Which group is it coming from? Us or users? If it's coming from outside, we could be spending more time on new development.<br />
<br />
Noah: From the hip, 2/3s outside 1/3 inside.<br />
<br />
Peter E: I did notice last year that more than half of the names in contributors list came after the beta tag.<br />
<br />
Robert: At the same time, by the time we get to beta 3 and RC 1, we don't get a lot of bug reports. Get a surge after dot-O. Can extend contract period between beta 1 + release and there are diminishing returns. That's why it's called a final release: all bugs are done [Jonathan laughed].<br />
<br />
Andres: Propose that RMT has the best insight into state of release<br />
<br />
Peter E: RMT already has the power to branch whenever they feel like and release whenever they feel like. That's possible in terms of branching - we tried that a couple of years ago and had an early review fest in June/July. Perhaps we try it again.<br />
<br />
Stephen: I like the general idea of having a branch open earlier because from that newcomer perspective, it's nice to have a new time of year where people commit a patch.<br />
<br />
Robert: It lets committers clear some smaller things off of their plates, so they have more time to give bigger things adequate attention.<br />
<br />
Tom: Stuff like that where we have a six-month window right now.<br />
<br />
Peter E: We talk a lot about the last commitfest, but the first commitfest is huge.<br />
<br />
Robert: I think we should stick to the same schedule as last year, but we will agree that we are going to document the expectations for the last commitfest better and we agree that the RMT can declare when they would like a new branch created even if we are not shipped yet.<br />
<br />
Stephen: The RMT is caring about the release, as long as they are paying attention to open items, and it is already in the mandate to open the branch early, then the committers can pick out smaller things from the later commitfest. We can go and pick stuff out of there and commit it into the branch once it's created, and that would give us the opportunity to have that first commitfest not be massive.<br />
<br />
Tom: Would help a little bit on the easy stuff.<br />
<br />
Andrew: Doesn't really matter if it's small or not?<br />
<br />
Robert: Not really, but you may not have that time if you're working on stabilization.<br />
<br />
[Andres' phone goes off]<br />
<br />
Peter E: Are we ready to branch 5 weeks from now?<br />
<br />
Jonathan: Asks about the branch<br />
<br />
Robert: Think it depends on how much we need to back patch. If branch too early, there is a lot of back-patching. Back-patching takes time and work, but if you're only going back one branch, it's not that bad. If Andres rewrites the executor for the 3rd time in 3 releases, it's a lot of work to fix bugs, but Andres will have to fix it anyway, so it load-balances out.<br />
<br />
Stephen: Maybe while we should discourage people from committing stuff from the last commit fest that's big, we should also say "hey we are going to branch, but be gentle."<br />
<br />
Tom: We could make an explicit suggestion that the point of opening the branch earlier is that by definition we are going to focus on little stuff.<br />
<br />
Noah: I think more like Robert, it's rare that back-patching two month old code is a problem.<br />
<br />
Peter E: If we are waiting for the RMT, could make it difficult to plan.<br />
<br />
Andrew: Perhaps we say after X period, we will do feature freeze. So there is an assumption in most cases we will branch relatively early.<br />
<br />
Tom: So more like RMT has veto over the feature freeze or postpone.<br />
<br />
Stephen: So to Peter's point, if RMT has veto power, it makes it easier to plan.<br />
<br />
Noah: If they can hold to a release date, they can hold to a branch date.<br />
<br />
Peter E: We can still have a commitfest and get the process rolling.<br />
<br />
Andres: We had a couple of patches where several seasoned community members said "Hey I don't want this to go in" but it went in anyway. We should be more accepting of each other's vetoes - it turns out that most of the concern is reasonable. Maybe not all of it. But if two committers vetoed and it went in anyway, I found it concerning.<br />
<br />
Stephen: I found it concerning also. One thing I will mention is that it depends on how you read into things; it wasn't clear to me that people were really asking for people to hold off, and perhaps we should be more explicit about that. I wasn't involved in this particular discussion actively, but it seemed like concerns were raised, some people thought they were addressed and they weren't, and it was unclear.<br />
<br />
Magnus: Add a way in the commitfest app for a person to say "hey, I have an interest in this patch"<br />
<br />
Andres: I would use that, try it, but does not address the issue if you don't decide to care, then you don't care.<br />
<br />
Bruce: I feel like if you do that, and bad things happen like in this case...<br />
<br />
Tom: There were several things that went in despite Andres' & my objections.<br />
<br />
Bruce: I assumed a committer could block anything...<br />
<br />
Tom: If you have commit bit, you can put it in.<br />
<br />
Andres: Sometimes that is good.<br />
<br />
Jonathan: Just need process for "breaking ties" with commits.<br />
<br />
Stephen: +1/-1<br />
<br />
Robert: Works well on design of features, not good for dumbness of code. Uses an example about an ASCII banner of "Stephen Frost" in core code.<br />
When you put something big into core, you are responsible for it, but you're also not responsible for it. Committers basically become collectively responsible for that thing when it breaks. I know that is one of my biggest concerns about things going in and they are not fully baked. If I know that the person committing it will be 100% responsible for fixing everything that breaks, I'm ok with it. If I know the effort is going to fall onto other people, such as me, I for some reason get agitated about it.<br />
<br />
Andres: If there are conflicts between committers, it is easier to override concerns in CF1 or CF2 vs. CF-LAST, it's easier to resolve earlier in the cycle than at the end.<br />
<br />
Peter E: What is the resolution in terms of the schedule?<br />
<br />
[Group is murmuring July].<br />
<br />
Robert: I am moving to Europe so I can get the entire summer off.<br />
<br />
Jonathan: Brings up the testing around Beta 1, only concerns around branching are based on user feedback from bugs. Also said willing to share notes from RMT meetings.<br />
<br />
[Discussions around committing]<br />
<br />
Andres: If we put in the document "In the last commitfest, we ask for committers to listen to objections more judiciously" then it's better than not.<br />
<br />
Dave P: It sort of implies you don't have to listen to earlier commitfests.<br />
<br />
Andrew: Don't know if writing a rule like that will have much effect.<br />
<br />
Andres: Even some of the people who have been working with the project for a while are unaware of some of the rules and expectations. Not everyone has the history or the memory of how some decisions were made.<br />
<br />
Stephen: This discussion is a great example: not everyone is here, and not everyone is going to read through all the notes; having some kind of documentation is necessary. What that ends up being is a good question, but yes, large items going into the last CF have been seen before. Committers are expected to be looking for positive responses from other committers before pushing a patch. If it's a big thing in the last commitfest, should try to get another committer involved.<br />
<br />
[Discussion of how to commit and getting people to agree/disagree]<br />
<br />
Bruce: If can't get everyone to agree, does the person still need to be a committer?<br />
<br />
Andres: So if I disagree with something, I lose my commit bit?<br />
<br />
Tom: Not going to get a 19-1 vote because not everyone is going to vote. More like 1-1 or 2-1.<br />
<br />
Greg S: Not clear if that one person really reviewed it and noticed something they should bring up, vs. the other two people did a thorough review.<br />
<br />
Robert: Not sure if this is hard to resolve. If someone said "You forgot a semi-colon" that's a minor concern. If someone makes a thorough comment, that's pretty clear. If you're thinking about committing something, you better have read the thread on the patch you're committing. If you haven't, then we should reconsider commit bits.<br />
<br />
Greg S: The usual dilemma is the patch that adds a feature that we want vs the compromises they might have made that limited future development. It's entirely possible for a committer to disagree with the priorities that the future development is more important than the current compromises for the feature. It doesn't mean that the committer who is willing to block the patch isn't willing to live with the feature. It's more priorities & dilemmas vs. binary decision.<br />
<br />
Dave P: If the committer makes it clear, people are usually okay to live with that.<br />
<br />
We are more-or-less back on schedule. We just need to decide what the commitfest schedule will be, and I think we need someone to take up the todo of writing up guidelines.<br />
<br />
Andres: I have a draft, Tom had some valid criticism, and I will re-read it at the backend of the discussion.<br />
<br />
'''ACTION ITEMS''':<br />
* Andres will edit the document with commitfest guidelines<br />
* Extra commitfest in July unless RMT vetoes (2018-07)<br />
* Commitfests will be 2018-07, 2018-09, 2018-11, 2019-01, 2019-03<br />
<br />
Jon- This came up from a note on the release thread. The idea is that we go through commit messages and build the list of features etc. With my advocacy hat on, it limits the messaging we can do along the way. Can we keep track of this as we go? Payoff is that at release time we just need to review.<br />
<br />
Tom- We did an experiment about 10 years ago with building the release notes on the fly. It became obvious that multiple people had worked on it in different styles.<br />
<br />
Jon- We don't necessarily need to write the actual release notes. This could be a wiki page or a feature on the commitfest app.<br />
<br />
Peter- Are you talking about high level features or everything?<br />
<br />
Andres- I could certainly use the notes as my brain isn't big enough to hold a full release at once.<br />
<br />
Robert- If someone wrote the notes monthly, then nothing interesting would change until March 1st.<br />
<br />
Bruce- If the way the features were added was discrete, it would work. As I recall previously we had to keep rewording things as things kept changing and we had to revise the way they were presented to the user. Sometimes a small feature at the beginning of the cycle would be subsumed into a much bigger feature.<br />
<br />
Jon- You don't write the executive summary until you've written the rest. However, I think that last release we missed multi-column statistics which was a big feature which people didn't see and we should have shouted about.<br />
<br />
Greg- Doing it incrementally can make things worse - people will think more of their own features.<br />
<br />
Heikki- I've found blog posts people write very useful<br />
<br />
[various folks agree]<br />
<br />
Bruce- It's very helpful as they can include examples etc.<br />
<br />
Andres- It helps people understand if they need to upgrade etc.<br />
<br />
[much discussion about prioritization and importance of different features]<br />
<br />
Peter- At the time we don't know. My favorite is in 9.4 where replication slots were buried at the end of the docs.<br />
<br />
Greg- At both Heroku and Gitlab I found people got really excited about new features, but not from the release notes.<br />
<br />
Michael- Blog posts are very helpful - they get us much more feedback and reports than we get from people reading the release notes.<br />
<br />
Jon- This is really about collating the list and getting the context around features so that we can keep track easily.<br />
<br />
Stephen- Are we saying we want committers to blog about every feature they commit?<br />
<br />
Robert- No blog post and you lose your commit bit!<br />
<br />
Peter- It's a good idea anyway to promote yourself and your company and get feedback etc.<br />
<br />
Noah- It doesn't sound like we have a process failure here.<br />
<br />
Magnus- It was just a bad call.<br />
<br />
Jon- So is there a way we can improve the context of what we're committing as we're going on?<br />
<br />
Bruce- The reason these things get overlooked is that some things are of strategic importance to PostgreSQL, but sometimes people aren't looking at things from that perspective so they get missed. Maybe the issue is that you (Jon) and I are looking at this more strategically than others.<br />
<br />
Robert- I got a major feature credit for a C API!!!1<br />
<br />
Magnus- We always get the marketing one version before the feature gets really good.<br />
<br />
Bruce- At least for 11 we have the top items, and we keep fleshing these things out over time.<br />
<br />
Peter- For 11 the major items fit on a page which I think is a good balance.<br />
<br />
Noah- Sounds like we're doing pretty well right now, but it doesn't seem like we have a proposal<br />
<br />
Magnus- Except that Tom is going to start a blog!!!<br />
<br />
Jon- Let's just keep this in mind, and help keep the context.<br />
<br />
Tom- I was going to suggest it would be nice to have a cumulative list.<br />
<br />
Jon- Maybe we should organize the wiki.<br />
<br />
Peter- Part of the problem is that a lot of the wiki is out of date, whilst a lot isn't.<br />
<br />
Thomas- The todo list should be called the don't do list.<br />
<br />
Andres- We can use templates on the wiki to mark info out of date.<br />
<br />
Stephen- For official info we should fold it into the main website.<br />
<br />
Peter- I suggest people are just more aggressive when editing the wiki. It's all version controlled after all.<br />
<br />
Stephen- We have an action item from that to fold official info into the website.<br />
<br />
'''Jon to work on that'''<br />
<br />
Peter- Advises caution to avoid making it harder to update things.<br />
<br />
Jon- We'll move all the locked pages.<br />
<br />
-- BREAK --<br />
<br />
Dave P: How can the commitfest and development process put more emphasis on long pending patches?<br />
<br />
Andres: This comes up from patches that first appeared in the 2013 release cycle. There are a number of patches that have been in the release cycle for 3+ years, and that is a) bad for PostgreSQL, b) bad for contributors, because for the authors it's really frustrating to go over the same thing for 4 years, and c) bad for reviewers, because it's frustrating to say "Hey, this is not ready again."<br />
<br />
One crazy-ish idea I had: we just say "Hey, the last commitfest closes two weeks earlier for anything younger than six months, and any patches that have needed review for longer than six months without back and forth (except for rebasing) can be committed in the last two weeks of the commitfest." Basically some procedural push for patches.<br />
<br />
Peter E: Is your assertion that patches are mostly there, or is it that they are troublesome in general?<br />
<br />
Andres: I think both, but mostly they just need some TLC. Change this architectural detail here, change this architectural detail there. Within a week or so of effort they can be committable.<br />
<br />
So for example, incremental sort. Everyone says we want it. It has been submitted since 2013 in one-year increments.<br />
<br />
Robert: The one year increments thing is part of the problem.<br />
<br />
Andres: Other patches too: multivariate statistics was waiting for review for considerable amounts of time. Letting patches rot that long is not a good thing.<br />
<br />
Robert: Unless the patch is really good can get through quickly. If a patch is a couple of 1000 lines, it takes a lot of time.<br />
<br />
Amit: The areas of specialty could be divided among the senior members or committers; at the end of the day, people need to find time in the release cycle / commitfest, and they will do that.<br />
<br />
Peter E: Sure, but you can't tell people what to do.<br />
<br />
Andres: People will say "Oh, this looks ready, I'll look at it at the end of the commitfest cycle." But it may take longer than that to properly review and give feedback.<br />
<br />
Peter E: But then you will just have people commit things two weeks earlier. I feel that at the end, there is a fear of "loss of version" and people are "I wish I could work more on this" but that's not really a technical issue.<br />
<br />
Noah: I do think it's reasonable to designate time for reviewing each patch - that's why the CF process originated. I wonder if the last commitfest is the right time for that?<br />
<br />
Andres: The reason I bring that up is because there is the hammer that you can't commit after that.<br />
<br />
Tom: I think a lot of these patches are part of the problem that Robert is alluding to, which means they take a fair amount of work and are probably not good material for getting in late. I would switch that around: do this earlier.<br />
<br />
Andres: July ones<br />
<br />
Robert: What would help is if someone could curate a list of these patches and just send them out. Here are the patches that have been going for more than X commitfests, in a softer way encourage people to work at those.<br />
<br />
Tom: Can we fix the web app to do that?<br />
<br />
Robert: The older the commit fest patch, the bigger the font size gets.<br />
<br />
Jonathan: At what scale?<br />
<br />
Robert: 4 pts (so linear).<br />
<br />
Peter E: But still doesn't say if it's something that isn't ready or something that will never get in.<br />
<br />
Andres: It's more the "Hasn't gotten any review, move to next."<br />
<br />
Peter E: There probably are certain patches that only require a week of work, but don't know which ones are.<br />
It is ultimately the responsibility of the people submitting the work to get it reviewed.<br />
<br />
Andres: But it can be hard due to corporate pressures. If we can get the world to contribute in this way to PostgreSQL i.e. on the reviewing side.<br />
People will get interest, but they will get interest 2 days before the freeze.<br />
<br />
Stephen: The intent of the CFs was to get committers to look at other people's patches. We were supposed to be hacking on our own stuff in the off months, and then committing stuff from other people during the CFs. This has now changed: committers push their own patches through during the CFs as well. That has led to a situation where committers are worrying about their own patches vs. other people's.<br />
<br />
David S: If we get more people shepherding patches throughout the year will help ameliorate the problem. Author participation is key.<br />
<br />
Tom: Assuming Robert's idea of the font size bigger for lagging reviews...<br />
<br />
Robert: There are only so many committers. If you are a non-committer who wants to get your patch committed, the number is not like 20, it's like 4-6. In 2016, there were 2 committers that committed the patches that were not self-commits of own work. In 2017, it was 4 committers (...). That means there is this ever increasing number of patches out there that are being funneled through a small number of people. Tom has gotten 12,000 LOC of code committed to PostgreSQL in 2017. That is a phenomenal achievement, that basically means people review code that does not have a lot of work to be done.<br />
<br />
This means we will need to give more people commit bits. They will not be as good at it as people who have been doing it longer. Everyone continues to learn things. One of the problems is when we give new people commit bits, we don't necessarily solve the whole problem, but for some people, even if all they do is commit their own stuff, it does spare a lot of people work that is required for junior work.<br />
<br />
We only lose if we give someone commit bit, and it generates more work for other people. The number of people who are able to review complex patches across the code base is very limited. When you have people who are consistently contributing 5,10,15K LOC each year, perhaps at some point we are being too picky. It's not like Tom's or my patches are bug free, but we know how to cover it up.<br />
<br />
Peter E: I do want to add to it, as this is a scaling problem: if you look at CF entries, it's 275 entries. If you do the math, you have to dispose of 6-8 patches per day; it is unrealistic to get through them at current capacity.<br />
<br />
Robert: +1 on math. Cannot have 200 patches in 30 days and 6 people and get through them all.<br />
<br />
Tomas: Not sure if committing stuff is the main problem. I have been the author of the statistics patch - been falling through a year without review. Even if I got the commit bit today, I would not push it because it did not get a review. Review is what pushes it forward. It did get reviews in last two weeks of the CF, but still not enough to get it through. Even if the community gives commit bits to more people, it does not solve the problem of the patches not getting reviewed.<br />
<br />
Andres: I think it does a bit, because when you become a committer, there is a feeling of "Hey, I need to commit stuff."<br />
<br />
Greg S: Looking at Joe C's graph here, moving patches to the next CF is becoming more and more common.<br />
<br />
[Joe C shows graph on small screen for "everyone" to see. Murmurs ensue around the room].<br />
<br />
Heikki: The pattern is: if a patch gets pushed and then gets pushed again, it gets harder for it to get in.<br />
<br />
Tomas: I think this is one of the problems with the statistics patch. It would be useful to make it a commitfest obligation to send, once in a while, a summary email of the current status of the patch: here is what the patch is supposed to do in its current version.<br />
<br />
Magnus: This sounds like the annotation feature that Peter G.<br />
<br />
Thomas M: Should be able to read the commitfest message and understand what the patch does.<br />
<br />
Robert: YES PLEASE!!!1<br />
<br />
Stephen: While I agree it's good to have it in the commit message, when I go to the CF app, I'm going to read threads first. But maybe I should read commit message first?<br />
<br />
Greg S: We have a lot of reviewers that are only technically capable enough to do cursory, basic review. Perhaps something we can ask people who want to contribute but they can't do a really in-depth technical review, but they could summarize what we can do.<br />
<br />
Andres/Stephen: -1.<br />
<br />
Tomas: I think summary emails should be written by the patch author.<br />
<br />
Heikki: Depends if I can trust the author or not on it.<br />
<br />
Andres: Still better than before. Give current state, past objections, future work, etc.<br />
<br />
Heikki: Would be helpful in the threads to ask the author to do so. Still need to go through and check, but it's better.<br />
<br />
Peter E: Could still easily assess the summary.<br />
<br />
Noah: Heikki asking early in how to motivate people to spend time on this. Good to get a response.<br />
<br />
Andres: There are also a lot of patches that are quite useful but are not as attractive to people to review/commit.<br />
<br />
Noah: I think that does happen, but it's hard to detect what's happening. If every committer took the time to try to review, we can try our best to get it in.<br />
<br />
[Discussion on how to review something like the statistics patch without a math background]<br />
<br />
Tomas: Reviewing is important because it is a chance to learn, even if you are not familiar with the math. If a patch is written and it does not get a review, is it actually a patch?<br />
<br />
Robert: It also plays into some patches everyone likes, but it is not everyone's priority.<br />
<br />
Jonathan: Do need to get more people involved. Need new committers, but need to train them on how to do it well. Also need to trust the work of people when writing summaries.<br />
<br />
Tom: Just having more committers doesn't solve the problem.<br />
<br />
Robert: We don't have the ability to get people to spend time on the project, but we do have the opportunity to decide how much people contribute to the project. We may say "No, we are not going to open the funnel any wider" or we can say "We are comfortable cracking it open a little bit more, and we take the next 2, 3, 4 people who are almost ready and say 'Hey, they're ready.'" If we don't change anything, it won't get any better.<br />
<br />
Our development velocity is limited by how many hours people can commit.<br />
<br />
Tom: Is there any way to motivate the existing committers to spend more time on this?<br />
<br />
Andres: This is why we brought up corporate pressures. If we don't have any community back pressure "Hey can't do that right now" good will, blah blah.<br />
<br />
Heikki: Could ask committers individually to see what they would contribute more.<br />
<br />
**ACTION ITEMS**:<br />
- Raise font sizes based on push lag (or sortable column)<br />
- Patch authors send summaries regularly. CF managers send summaries to help rally the team<br />
- Look into getting more committers and getting them to review smaller patches until trained up to review large patches<br />
- Also help authors / committers facilitate reviewing other patches<br />
<br />
Dave P: GDPR is an EU directive that has to be implemented by EU members by May 25. It is a new set of privacy regulations and rules with extremely stiff penalties if you break them. It includes things like the right to be forgotten, seriously strong protections against what is classified as personally identifiable information including IP addresses. Anyone who is in the EU or does business in the EU has to comply. This includes the project.<br />
<br />
[Discussion on GDPR].<br />
<br />
[[Category:PostgreSQL Events]]<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2019_Developer_Meeting&diff=37557PgCon 2019 Developer Meeting2023-02-10T08:42:27Z<p>Alvherre: </p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Tuesday 28 May, 2019 at the University of Ottawa, prior to pgCon 2019. In order to keep the numbers manageable, this meeting is by '''invitation only'''.<br />
<br />
The invitation list for the meeting has changed this year to include representatives from various project sub-teams, for example, packagers, the release team, Code of Conduct committee and more.<br />
<br />
As at last year's event, an Unconference will be held on Wednesday for in-depth discussion of technical topics.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the 13.0 release cycle<br />
* Address any proposed timing, policy, or procedure issues<br />
* Receive updates from project sub-teams on their activities and discuss any resulting issues or concerns.<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 12PM<br />
* DMS 3105 - Desmarais Hall, 55 Laurier Avenue East<br />
* University of Ottawa.<br />
<br />
Lunch will be served after the meeting.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname). Note that we can accommodate a '''maximum of 30'''!<br />
<br />
# Joe Conway<br />
# Jeff Davis<br />
# Andrew Dunstan<br />
# Peter Eisentraut<br />
# Andres Freund<br />
# Stephen Frost<br />
# Peter Geoghegan<br />
# Robert Haas<br />
# Magnus Hagander<br />
# Stacey Haysler (present for CoC report and discussion only)<br />
# Álvaro Herrera<br />
# Amit Kapila<br />
# Jonathan Katz<br />
# Tom Lane<br />
# Heikki Linnakangas<br />
# Noah Misch<br />
# Bruce Momjian<br />
# Thomas Munro<br />
# John Naylor<br />
# Dave Page<br />
# Michael Paquier<br />
# David Rowley<br />
# Masahiko Sawada<br />
# David Steele<br />
# Robert Treat<br />
# Tomas Vondra<br />
# Gregory Stark<br />
<br />
The following people will not be in Ottawa, and do not plan to attend:<br />
<br />
* Christoph Berg<br />
* Vik Fearing<br />
* Devrim Gunduz<br />
* Andreas Scherbaum<br />
* Sarah Conway Schnurr<br />
* Alexander Korotkov<br />
* Ilya Kosmodemiansky<br />
* Amit Langote<br />
* Haribabu Kommi<br />
<br />
== Agenda Items ==<br />
<br />
* 13.0 release and commitfest schedule (Dave)<br />
* Contributor Recognition (Andres, happy to share / pass, but should be discussed)<br />
** [https://www.postgresql.org/community/contributors/ contributors] page update - how well is it working?<br />
** should the developer meeting serve as recognition? <br />
* The Evolving Developer Meeting: Goals & What We Want to Accomplish by the end of it? (Jonathan, happy to share)<br />
* SQL standard update (Peter E.)<br />
* Locale apocalypse (Peter E.)<br />
* Criteria for major version release notes (Peter Geoghegan)<br />
* Release notes depth level (Álvaro Herrera, happy to share)<br />
* ''Please add suggestions for agenda items here. (with your name)''<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|-<br />
|09:10 - 09:20<br />
|Code of Conduct report and discussion<br />
|Stacey Haysler<br />
<br />
|- <br />
|09:20 - 09:25<br />
|Core team/PGCAC update<br />
|Dave Page<br />
<br />
|- <br />
|09:25 - 09:30<br />
|Infrastructure update<br />
|Stephen Frost<br />
<br />
|- <br />
|09:30 - 09:35<br />
|Web & Advocacy update<br />
|Jonathan Katz<br />
<br />
|- <br />
|09:35 - 09:40<br />
|Funds & Sponsors update<br />
|Robert Treat/Joe Conway<br />
<br />
|- <br />
|09:40 - 09:50<br />
|13.0 release and commitfest schedule<br />
|Dave Page<br />
<br />
|- <br />
|09:50 - 10:10<br />
|Contributor Recognition: Contributors page update; how well is it working? Should the Developer Meeting serve as recognition?<br />
|Andres Freund<br />
<br />
|- <br />
|10:10 - 10:30<br />
|The Evolving Developer Meeting: Goals & What We Want to Accomplish by the end of it?<br />
|Jonathan Katz, Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|11:00 - 11:10<br />
|SQL standard update<br />
|Peter Eisentraut<br />
<br />
|- <br />
|11:10 - 11:20<br />
|Locale apocalypse<br />
|Peter Eisentraut<br />
<br />
|- <br />
|11:20 - 11:35<br />
|Commitfest Management<br />
|David Steele<br />
<br />
|- <br />
|11:35 - 11:50<br />
|Release notes: Criteria for major version release notes and depth level.<br />
|Peter Geoghegan, Álvaro Herrera<br />
<br />
|- <br />
|11:50 - 12:00<br />
|Any other business<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00<br />
|Lunch<br />
|<br />
<br />
|}<br />
<br />
Note: This timetable is a rough guide only. Items will start as soon as the previous discussion is complete (breaks will not move however). Any remaining time before lunch may be used for Commitfest item triage or other activities.<br />
<br />
==Minutes==<br />
<br />
Jonathan Katz recording.<br />
<br />
===09:00 - 09:10 Welcome and introductions===<br />
<br />
'''Dave Page''': Welcome to the 10th(?) annual developer meeting in Ottawa. Lots of new faces, so let's do introductions.<br />
<br />
''People go around the room making introductions about what they do. Given how verbose the group is on the mailing lists, people are surprisingly brief.''<br />
<br />
'''Dave Page''': We will be having a few reports from various projects that we do not normally hear from. The reason for this is that every year I partially struggle trying to find items for the agenda. We do usually end up having good discussion at the end of the day. It has also been mentioned a number of times, including at the dev meeting in Brussels, that we have always been very focused on the core project, PostgreSQL itself. Some people at that meeting felt that it would be beneficial to hear the wider view of some other aspects of the project.<br />
<br />
We will have some discussion time later on in the meeting to discuss whether or not this is working, if it's something that we want to continue, etc. We will hear from other aspects of the project. Hopefully we will find it interesting and there will be some useful discussion to be had.<br />
<br />
Kicking it off, Stacey Haysler.<br />
<br />
===09:10 - 09:20 Code of Conduct report and discussion===<br />
<br />
'''Led by Stacey Haysler'''<br />
<br />
'''Stacey Haysler''': Code of Conduct Committee kicked off last year. One was a threat of personal violence which resulted in a personal ban. There were two reports of sexual harassment. One resulted in a request for continuing education. One resulted in a temporary ban.<br />
<br />
No current active investigations. We are encouraging conference organizers to reach out to us for educational resources for understanding how to handle a code of conduct complaint if it should come up. The CoC committee is here to help.<br />
<br />
''Q&A proceeds about general nature of incidents, how they occurred. How the CoC was modeled after other communities and what the PostgreSQL community does. Discussion of the overall effect of the CoC committee. Stacey covers the CoC process in terms of what happens when a complaint is made.''<br />
<br />
===09:20 - 09:25 Core team/PGCAC update===<br />
<br />
'''Led by Dave Page'''<br />
<br />
''Dave Page provides an update.''<br />
<br />
===09:25 - 09:30 Infrastructure update===<br />
<br />
'''Led by Stephen Frost'''<br />
<br />
'''Stephen Frost''': We bought a server last year -- have a project server being hosted by Magnus Hagander & co. We continue to have a lot of things donated to us. New boxes from Packet.net. We will be moving the buildfarm server because it's getting too big, will be giving it to Packet.net. Packet has given us 3 very large systems. Upgrading buildfarm will be a part of it: upgrading, partitioning, and getting it onto a version of PostgreSQL that's not EOL.<br />
<br />
The other big project going on is upgrading the wiki. It has been ongoing for 3 years. It is not trivial to do.<br />
<br />
PGLister continues to be a good project and we continue to hack on it. Sometimes there are things that people don't like, but we go into it and improve it.<br />
<br />
'''Andres Freund''': Could we have some stats about traffic, downloads, etc. publicly?<br />
<br />
'''Stephen Frost''': We could possibly share some of the Grafana dashboards.<br />
<br />
'''Peter Eisentraut''': Who else are admins? Do we have enough people? Do we need to train new people? How many servers do we have? Who is paying for them?<br />
<br />
'''Stephen Frost''': At present, we are doing pretty well. Servers are being maintained. We have agreements with all of our providers for physical maintenance of them. We are keeping up-to-date with patching. The biggest issue we have at the moment is that some of the larger projects really are large, and we need to figure out when we can spend time on them versus on other stuff.<br />
<br />
Not sure if we need more people at the moment.<br />
<br />
'''Peter Geoghegan''': Is there a coverage issue of people being awake?<br />
<br />
'''Magnus Hagander''': Joe has ruined that by moving.<br />
<br />
'''Andres Freund''': Is there a related problem that you have been in the project for ages and the project is not renewing? Do we need newer people on it?<br />
<br />
'''Magnus Hagander''': That is a fair point.<br />
<br />
'''Stephen Frost''': The newest person on the team is Joe, and he joined 5 years ago.<br />
<br />
'''Robert Haas''': How do people get involved in infrastructure side of things?<br />
<br />
'''Magnus Hagander''': Accidentally on purpose.<br />
<br />
'''Robert Haas''': The process could be better, e.g. on the development side, there is a way to do so. Less clear on the sysadmin side.<br />
<br />
'''Magnus Hagander''': We have sysadmin, and non-sysadmin. Once you are on the sysadmin team, you have access to everything. It is harder to step in and volunteer on that. We can work on better ways to get separation of concerns.<br />
<br />
'''Stephen Frost''': If you are volunteering to help us run everything as opposed to some things, that would be better.<br />
<br />
Agreement to discuss more.<br />
<br />
====Action Items====<br />
<br />
* Infrastructure team to discuss ways to recruit new members to the team<br />
<br />
===09:30 - 09:35 Web & Advocacy update===<br />
<br />
'''Led by Jonathan Katz'''<br />
<br />
''Unfortunately the person recording could not record all responses as he was presenting. Notes read more as a summary of the discussion.''<br />
<br />
'''Jonathan Katz''': Main project last year was the redesign of the website. Goals were to make it easier to get to key pages (e.g. downloads, documentation), mobile friendliness, and overall ease of use.<br />
<br />
Learned lessons from poor rollout of changes to the archives viewer that made them less friendly than before, and created a "beta" period to test new documentation pages. This seemed to be a good strategy to collect feedback. Would like to work with pginfra to make said testing easier.<br />
<br />
To measure success of these initiatives as well as a variety of advocacy initiatives, looked at overall web traffic to see if there have been changes. Did a year-over-year comparison (May 28, 2018 - May 27, 2019 vs. May 29, 2017 - May 27, 2018).<br />
<br />
* Homepage ~4.8% increase. This was used as a baseline<br />
* Downloads: ~32.5% increase. Windows / Mac downloads pages had a 24% / 27% increase in traffic respectively<br />
* Documentation: 37% increase in traffic.<br />
* As an interesting note, the about page had a 63% increase in traffic. The overall content on the about page had changed to provide a higher-level overview of features PostgreSQL offered, information about its adoption, ways to get started, and some statistics about the project.<br />
<br />
Interestingly, the traffic to the press release on the PostgreSQL website between PG11 and PG10 was down ~50%, but we still see an uptick in overall numbers. One hypothesis is that we did a better job in overall awareness of the release so that our users knew what features were coming and as such were ready to download, but we cannot draw any definitive conclusions.<br />
<br />
''Peter Geoghegan makes suggestions on how to make the archives mobile friendlier. Jonathan Katz asks him to report said suggestions to pgsql-www@lists.postgresql.org''<br />
<br />
Recently had the PostgreSQL 12 Beta 1 announcement. Year-over-year stats compared to PostgreSQL 11 Beta 1 announcement show a 16.6% increase in reading of press release, and a 20% increase to the downloads page.<br />
<br />
'''Other major projects include''':<br />
<br />
* Moved platform to Python 3<br />
* Documentation: remove `/static/` URL; support for SVGs to be scalable<br />
* Release notes archive now on website<br />
* Lots of cleanups<br />
** Cleanup of professional services & platform pages, as it had been 5 years :(<br />
** Reorganization of feature matrix...though probably needs more<br />
** Removal of regional contacts page<br />
** Contact page reorganization. Tried to get people to the correct points of contact more easily, but not sure if it's working<br />
* Code linting<br />
<br />
The Twitter account has grown from 0 to almost 12,000 followers over the course of 16 months. The one thing we suffer from is consistency of tweets; this could be helped by having more people.<br />
<br />
With events, a lot of events have adopted the community guidelines that have been proposed.<br />
<br />
====Action Items====<br />
<br />
* Ongoing web projects<br />
** General navigation and navigation on specific pages<br />
** Content rewriting & cleanups<br />
** Incorporating training guidelines to list training companies on website<br />
** APT viewing project<br />
** Figuring out better ways of handling drive-by patches<br />
* Recruiting more people to Twitter team<br />
<br />
===09:35 - 09:40 Funds & Sponsors update===<br />
<br />
'''Led by Robert Treat & Joe Conway'''<br />
<br />
'''Joe Conway''': Years ago, there was an original effort to create a NPO for PostgreSQL like the Linux Foundation. This did not happen. When that was dissolved, there was a formation of a funds group that is connected to Software in the Public Interest, which was an org that Debian formed to manage funds that come in to sponsor the project.<br />
<br />
The funds group was a team of people who volunteered to govern the project's use of money. This group has been involved in various activities over the years. It was set up so that a Liaison would direct the funds with SPI. There was also a board adviser that would watch over the SPI board. This group had not changed in around 12 years.<br />
<br />
Last year, a bunch of things changed. Currently there are 17 people who are active members. According to the self-agreed governance rules of that group, if you wanted to be a part of the group you would send a message to funds-group@lists.postgresql.org; if you are interested in sending money you would also send a message to that group.<br />
<br />
Last year, we had to disable accepting donations via credit cards due to GDPR. We were able to recently re-enable it. We reconciled the membership as well.<br />
<br />
The PostgreSQL website was directing all the traffic to an individual, now it is going to a more generic address. Back in February, we had elections. Robert Treat is the SPI Liaison. Dave Page determined that SPI does not actually use board advisors. We changed the second position to be a backup liaison. Dave Cramer is the backup liaison. We revamped the governance documents, and migrated them to postgresql.org. Need to set a forwarding address.<br />
<br />
In terms of actual spending -- the current funds according to SPI are about $142,000 USD. Need a better system to figure out what we have spent on as we currently have to go through emails. In the past year:<br />
<br />
* $500 for artwork for PostgreSQL 11<br />
* ~$12,500 for servers<br />
* $300 for press release for PostgreSQL 11<br />
* $260 for expenses for GSoC travel<br />
* $4400 for expenses for Slonik lapel pins<br />
<br />
'''Currently under discussion''':<br />
<br />
* $500 for artwork for PostgreSQL 12<br />
* Google Code-In has donated specific money earmarked for specific purpose. Pre-advance: $700 for person<br />
* USB drives: looking to buy a bunch for $3000<br />
* Hex stickers with Slonik - $1200<br />
* Baseball caps - $1500<br />
<br />
'''Greg Stark''': I'm surprised how little we have spent on travel for conference for students and how much we have spent on merchandise.<br />
<br />
'''Peter Eisentraut''': The reason is that this group is not super well known. We need to better advertise that this exists.<br />
<br />
'''Peter Geoghegan''': Is there an issue with that someone does not feel empowered to write a real check?<br />
<br />
'''Joe Conway''': In the past 6 months we have not gotten around to advertising the availability of this.<br />
<br />
'''Robert Treat''': Important to understand is that this group has been around 15 years; there are also a lot of other groups (PgUS, PgEU, JPUG); the funds group is the only one that has an international charter to help, but we also try to encourage people to reach out to their local groups.<br />
<br />
'''Andres Freund''': Is there a chance that you can do a yearly report like the CoC does?<br />
<br />
'''Robert Treat''': Those records are more public, and we want to do a better job of showcasing it.<br />
<br />
'''Greg Stark''': Is there a holistic view of where you want to be spending money when receiving donations? If you are only approving things on a case-by-case basis, you might end up in a situation where e.g. you are only spending on one thing.<br />
<br />
'''Robert Treat''': There is not a specific agenda as of yet, but that could be discussed.<br />
<br />
'''Robert Treat''': On the PostgreSQL website, there is a page where we list corporate sponsors of the PostgreSQL project. Over the past year, we have gotten our act together. There is a group of five people that should discuss which companies should be listed and not listed per guidelines. We take a group vote to say yes / no. There are only two categories: Major Sponsor, and Sponsor. We have committed to updating it at least every 6 months.<br />
<br />
One of the big goals in general is to create more visibility, so we have added to the website how the group works and when we have last updated the sponsors list.<br />
<br />
====Action Items====<br />
<br />
* Funds group to work on making funds group as a resource more visible through community channels<br />
* Funds group to work on creating a plan for how funds are spent as well as fundraising goals once that is set<br />
<br />
===09:40 - 09:50 13.0 release and commitfest schedule===<br />
<br />
'''Led by Dave Page'''<br />
<br />
''General consensus is to branch just before the first PostgreSQL 13 commitfest.''<br />
<br />
'''David Steele''': The extra commitfest saw about as many commits as any non-final commitfest.<br />
<br />
'''Andrew D''': The July one did not appear to have many patches committed.<br />
<br />
'''Peter Eisentraut''': There were about 50 patches that were otherwise handled.<br />
<br />
'''David Steele''': Patches that are backlogged are often backlogged for a reason. They are no less problematic in July than they are in March. I think the extra CF was useful. The question was: was it too much of a burden on the committers? Are committers OK with it?<br />
<br />
'''Dave Page''': Anyone object to doing the same as 12?<br />
<br />
''Consensus is same as last year.''<br />
<br />
'''Andres Freund''': One mildly related issue: I suggest that we do not have RMTs that have non-overlapping timezones.<br />
<br />
====Action Items====<br />
<br />
* Branch for PostgreSQL 13 will occur just before July 2019 commitfest<br />
* Commitfest schedule will mirror that of the PostgreSQL 12 cycle: July 2019, Sep 2019, Nov 2019, Jan 2020, Mar 2020.<br />
<br />
===09:50 - 10:10 Contributor Recognition: Contributors page update; how well is it working? Should the Developer Meeting serve as recognition?===<br />
<br />
'''Led by Andres Freund'''<br />
<br />
'''Andres Freund''': What came up in a somewhat unrelated discussion was that the developer meeting serves as recognition for some developers who are otherwise not well recognized. Some people suggested it was a good idea, some bad. That made me also wonder how we deal with recognition. I noticed that the developers list was outdated.<br />
<br />
'''Stephen Frost''': One item is that we have a contributors team that is trying to figure this out. It has been unfortunately quiet. IIRC it's Robert Haas, Vik F., Dave Page, and myself.<br />
<br />
'''Robert Haas''': We've had a little bit of email discussion and it tailed off. We need to figure some things out. This process is hard as there is subjectivity about measuring levels of contribution. There is also subjectivity about every other aspect of the process.<br />
<br />
'''Dave Page''': Basically a similar problem with the sponsors, but five times harder.<br />
<br />
'''Robert Haas''': I would be glad to have a few more people on the contributors team; it would be helpful to have a few more discussions around that. We have enough consensus that we can get some changes done and enough effort that those changes can be made. Things are imperfect, but we will figure it out.<br />
<br />
'''Peter Eisentraut''': I know it's hard work and please keep it up. One scheduling point is that traditionally these sorts of updates have happened around this time of year, which probably confuses a lot of people who contributed to Release X and then have to wait another year to be listed. It would help if you could aim for a 6-month schedule, or try to get it updated at least twice a year.<br />
<br />
'''Robert Haas''': If we got a complete update of all of it done once, that would be a good first step. Then there is the question of measuring whether someone should be listed as a Contributor vs. Major Contributor. There have always been a lot of people that we do not list. We could easily list everyone who has made a contribution to PostgreSQL, and that list could be on the order of 1000 people. The more that we adjust the standard, the more people we could recognize. However, that is a bigger administrative burden.<br />
<br />
''Discussion ensues over process.''<br />
<br />
'''Robert Haas''': There is an issue where people contribute over years and they do not get recognized at all. This is one reason why I started my blog post on code contributions to be able to shed some light on who contributes.<br />
<br />
'''Peter Eisentraut''': Can we have the team to establish a short term target? Set the process in place by X.<br />
<br />
'''Stephen Frost''': Let's set some cadence around the team.<br />
<br />
'''Robert Haas''': +1. We need to make it more clear what factors are taken into consideration.<br />
<br />
'''Dave Page''': One thing we have learned from the sponsors is to not try to codify too much. List general guidelines, but as soon as you try to codify an actual algorithm / scoring criteria, you will be bikeshedding for years etc.<br />
<br />
'''Heikki Linnakangas''': If you do it more often, like 6 months, or even 3 months, you come up with a list of new people and who has contributed in the last few months. It is easier to come up with names who are active or became active recently. You can keep track if they are continuing to do so, and later if they are still active and actively contributing, that's a way to make it work.<br />
<br />
'''Robert Treat''': When I used to maintain the list myself, if I knew someone who should be on that list, I would add them. When new committers are being evaluated, and they are not a recognized contributor, they probably should be on that list.<br />
<br />
'''Robert Haas''': Let's at least have a chat for all those who are here.<br />
<br />
'''Dave Page''': Just to explain the original intent: the developer meeting from the very first one that was organized was never intended to be any form of recognition. It was intended that the developers who were the biggest contributors were the ones that were invited.<br />
<br />
'''Andres Freund''': Should we consider removing the list from the wiki?<br />
<br />
'''Peter Geoghegan''': There is an implicit recognition. People will feel recognition.<br />
<br />
''Discussion ensues over publication of RSVP list.''<br />
<br />
'''Andres Freund''': Clear information about the different sections on the contributors website.<br />
<br />
'''Jonathan Katz''': +1; part of content rewrite.<br />
<br />
'''Heikki Linnakangas''': If we can change the theme of each meeting every year and vary the theme, would help with how people perceive the meeting.<br />
<br />
'''Peter Geoghegan''': +1<br />
<br />
''Dave Page provides history of the meeting.''<br />
<br />
====Action Items====<br />
<br />
* Contributors team will work on setting up some guidance into what is taken into consideration on list. Will set up a more regular cadence for evaluation<br />
* Will work on setting some short term goals on getting this done<br />
<br />
'''Dave Page''': Let's move evolving developer meeting agenda item to the end.<br />
<br />
===10:35 - 10:45 Coffee break===<br />
<br />
'''Peter Eisentraut''': We have a new committer as of right now. '''David Rowley'''.<br />
<br />
''Congratulations and huzzahs all around''<br />
<br />
===10:45 - 10:55 SQL standard update===<br />
<br />
'''Led by Peter Eisentraut'''<br />
<br />
'''Peter Eisentraut''': As of earlier this year, I am a member of the SQL Standard Working Group. I had a chance earlier this year to meet people who are active in the group, and they gave me some starting points on what is actually helpful. It is a volunteer effort: work is done by people who show up and do the work.<br />
<br />
My goal is to make sure that we are not blindsided by new developments. I felt that SQL:2016 surprised a bunch of people on some of the work, e.g. JSON functionality. I want to keep an eye on what is going on for now. Next update is the plan for 2020 or 2021. If there is more interest, I can organize an unconference session tomorrow to talk about what's going on.<br />
<br />
'''Andres Freund''': Are there other meetings?<br />
<br />
'''Peter Eisentraut''': One in South Korea, not planning to go. Some meetings next year closer to where I live. There is a certain time investment in terms of participating actively. If anyone has questions, please let me know.<br />
<br />
'''Stephen Frost''': From discussions you've had, any feeling / thought / impressions of PostgreSQL?<br />
<br />
'''Peter Eisentraut''': The people I've talked to have heard of PostgreSQL and are happy we are getting more involved.<br />
<br />
'''John Naylor''': Do other OSS projects have representation there?<br />
<br />
'''Peter Eisentraut''': I don't think so.<br />
<br />
====Action Items====<br />
<br />
* With Peter being on the committee, we will better know what changes are coming into SQL standard that could affect ongoing or planned work<br />
* Exploring what will happen with new committee membership<br />
<br />
===11:00 - 11:10 Locale apocalypse===<br />
<br />
'''Led by Peter Eisentraut'''<br />
<br />
'''Peter Eisentraut''': In glibc 2.28, which was published in Aug 2018, they changed the locale data significantly. The Linux distributions with LTS are now starting to come out. The first one was RHEL 8, which came out a few weeks ago. This change will affect such users.<br />
<br />
What do we do? It's too late to bake something into PostgreSQL. We can't put a bug fix into glibc. There is a wiki page, [[Locale_data_changes|Locale Data Changes]]; at the very least we can send people to it to understand the changes.<br />
<br />
'''Thomas Munro''': A bunch of operating systems have given up on dealing with it and are converging on UCA.<br />
<br />
'''Greg Stark''': Can we do something with point releases?<br />
<br />
'''Peter Eisentraut''': For users, they will be affected by operating system upgrades.<br />
<br />
'''Heikki Linnakangas''': How can users detect the issue?<br />
<br />
'''Peter Eisentraut''': We have put tests on the wiki page. We have also collected the behavior changes across a variety of operating systems. We advise that after an upgrade, users run amcheck or reindex or both.<br />
<br />
'''Michael Paquier''': Can we make REINDEXDB smarter about that?<br />
<br />
'''Peter Eisentraut''': Potentially, yes.<br />
<br />
'''Thomas Munro''': There is a massive thread about this. Probably an unconference discussion on this.<br />
<br />
'''Peter Geoghegan''': You can use ICU of course, but only as a per-column collation.<br />
<br />
'''Peter Eisentraut''': My goal for PG13 would be to make ICU global.<br />
<br />
'''Peter Geoghegan''': It would be nice to make it the de facto standard on certain platforms.<br />
<br />
'''Stephen Frost''': Like all of them.<br />
<br />
'''Peter Eisentraut''': Looking forward to discussing more tomorrow.<br />
<br />
====Action Items====<br />
<br />
* Ensure information about the locale issues that can crop up are effectively communicated to users. Have [[Locale_data_changes|Locale Data Changes]] as starting point<br />
* Unconference session about locales<br />
<br />
===11:10 - 11:25 Commitfest Management===<br />
<br />
'''Led by David Steele'''<br />
<br />
'''David Steele''': First thing I would discuss is the extra commitfest in PG12; sounds like we will continue doing that. It increases overall capacity during the year. The focus was on older patches that were at least doable.<br />
<br />
We have an ongoing problem that we discussed last year: we have patches that carry over from commitfest to commitfest and year to year. Sometimes it is because the patch itself is not ready; sometimes it is because people who cannot get their stuff into e.g. a PG12 will drop it. It is hard for people to take unmaintained patches seriously.<br />
<br />
We added the feature where we tag CF entries as being for PG13 even though they were in the CF for PG12. It was both good and confusing. You had to constantly filter to see what we were considering for 12. However, people were less upset when I marked it for PG13 and pushed it to the next commitfest.<br />
<br />
'''Andres Freund''': A lot of the items that got feedback; knowing it was for PG13 was "I can give it review, but may not be as thorough knowing that it wouldn't be for 12."<br />
<br />
'''David Steele''': I would like to say that Andres Freund's triage for the CF items has been invaluable. It has been great to compare my notes to his notes; it makes it easier to deliver bad news to people when a patch needs to be moved.<br />
<br />
As far as the filtering goes, I found I would have to munge the filters for the items that I am looking at. It would be nice to have a "not" filter, e.g. "NOT PG13", so I could look at the "current" stuff. Or a filter that is "current" so I can see what is currently going on for the current / stable view.<br />
<br />
'''Andres Freund''': On the triage item, I'm glad to do it, but I would like it if other people would triage as well. It would be less arbitrary than if it was just me. It would help if more people chimed in, triaged, and said "Hey, it's not ready because XYZ".<br />
<br />
'''David Steele''': Having your triage helped me to move items that did not belong in PG12. I used a different method for tracking the stuff I was looking at this year. There is nothing in the CF app that lets me know if I recently viewed a patch.<br />
<br />
''Discussion ensues about some of the commitfest process and about marking a patch as a work-in-progress.''<br />
<br />
'''David Steele''': Marking as the current version is a declaration of intent, i.e. we are planning to get something into 12.<br />
<br />
''Stephen Frost brings up topic about how we can ensure there is a steady stream of CFs''<br />
<br />
'''Andres Freund''': Perhaps we need to do more to reject patches which could help ease some of the burden of managing commitfests.<br />
<br />
''Discussion ensues about various procedural issues.''<br />
<br />
'''Thomas Munro''': Earlier you were talking about target version. Why are bugs subject to different versions of commitfest items and why should they be subject to certain rules?<br />
<br />
'''Tom Lane''': We basically have that list of bugs there so we don't forget about it.<br />
<br />
''Thomas Munro is volunteered to be CFM''<br />
<br />
'''Jonathan Katz''': Suggest adding these guidelines + recommendations to the wiki: [[Running_a_CommitFest]]<br />
<br />
''Jonathan Katz is happy to help Thomas with CFM''<br />
<br />
'''David Steele''': Noting that last one tends to be most challenging given there are some other challenges beyond just technical aspects.<br />
<br />
====Action Items====<br />
<br />
* Update [[Running_a_CommitFest]] on best practices that David Steele has mentioned.<br />
* Thomas Munro & Jonathan Katz should sync up on commitfest management work<br />
<br />
===11:35 - 11:50 Release notes: Criteria for major version release notes and depth level.===<br />
<br />
'''Led by Peter Geoghegan and Álvaro Herrera'''<br />
<br />
'''Peter Geoghegan''': There was a consensus against the current model that the current level for technical depth is appropriate, we are after all not as close to users. There are a number of people, including myself, that feel the level of depth of the release notes is insufficient as it does not support the goals of the release notes to provide insight into what they could benefit from, e.g. things that could have regressed workload, etc.<br />
<br />
The inclination of the development community is increasingly towards performance stuff. Perhaps the current model made more sense on its own terms previously, and perhaps the current model could be revisited.<br />
<br />
'''Noah Misch''': Do you have a concrete proposal?<br />
<br />
'''Peter Geoghegan''': I would suggest that there could be more deference given to the views of the author in each case.<br />
<br />
'''David Rowley''': One of the things that gets left out commonly is query planner improvements, and I think it's a problem. Recently I had a customer upgrading from 9.2 => 9.6, and they had a query that was performing better. It had an EXISTS with a LIMIT 1 clause. I remember the patch in 9.5 that affected this. I remembered the commit that was broken, fixed it, changed it etc. I suggested to the customer to review the release notes as well for other changes, but said change was not in the release notes.<br />
<br />
'''Peter Eisentraut''': Other than the length of the release notes, what would be the counter argument for adding more detail?<br />
<br />
'''Robert Haas''': I agree in general. The challenge, as I understand it, is that it is hard to judge from the commit messages how much detail the users care about and how to explain that clearly.<br />
<br />
'''Bruce Momjian''': There are two issues I think about in trying to make the release notes digestible to users. I get why all of us would like to see the kind of detail that helps with more technical matters. I write the release notes for people who are going to be reading all of the release notes.<br />
<br />
One criterion I use is the length of the notes. The second is that if I hit three items that I cannot make sense of as a regular user, then I may tune out.<br />
<br />
'''Robert Haas''': The issue is that there is nothing else. The situation David R. is talking about does happen. If you cannot find the information in the release notes, you have to look in the commit log. I don't really agree with the idea that we should be afraid of putting things in the release notes because we are worried that people will not understand them. I worry we lose more by not documenting things that are important to people.<br />
<br />
'''Peter Eisentraut''': There are strategies to try to make adding the items more accessible.<br />
<br />
'''Stephen Frost''': We should at least list the things that are included; they may not have a lot of information in the note itself, but we can point to it.<br />
<br />
'''Peter Geoghegan''': I'm fine with it being a discrete item, but the point is that it needs to be discoverable.<br />
<br />
'''Bruce Momjian''': I would also love to see wiki pages on it, so I can discover it myself.<br />
<br />
'''Robert Haas''': These are workarounds for lacking documentation. Example is: planning around partitioning is faster in certain cases. Do we have documentation around which cases?<br />
<br />
Can have drill downs for places to get more information.<br />
<br />
'''Alvaro H''': Another place we can do that is to add additional items to the feature matrix.<br />
<br />
'''Andres Freund''': Need to be careful about use feature matrix for features vs. behavioral changes.<br />
<br />
'''Noah Misch''': Need to ask ourselves: are we happy about the surprises users might see?<br />
<br />
'''Tom Lane''': I'm afraid that a lot of these cases is that the thing the user is wishing to see is an unintended consequence of the release.<br />
<br />
'''Jonathan Katz''': In user days, would definitely re-read major release notes to find things. Can make it easier on pgweb to point out where to find where major incompatibilities changed.<br />
<br />
''Discussion on going back to ensure that old release notes document incompatibilities''<br />
<br />
'''Stephen Frost''': Dedicated area for incompatibilities?<br />
<br />
'''Greg Stark''': Want to ensure we highlight incompatibilities that could surprise the user.<br />
<br />
'''Heikki Linnakangas''': It would be nice to drill down further on the features, wherever it may be. Could make it part of the documentation.<br />
<br />
'''Peter Eisentraut''': Do we want to try to do this for PG12?<br />
<br />
'''Peter Geoghegan''': It's only a matter of displaying information that is already there.<br />
<br />
'''Bruce Momjian''': Show the specific commits around it?<br />
<br />
====Action Items====<br />
<br />
* Research into ways to provide more details within the release notes with hidden description boxes, etc.<br />
* Further discussion on ways to surface incompatibilities between versions<br />
* Look into displaying URLs to commits in release notes, as information is available<br />
<br />
===11:50 - 12:20 The Evolving Developer Meeting: Goals & What We Want to Accomplish by the end of it?===<br />
<br />
'''Led by Dave Page and Jonathan Katz'''<br />
<br />
'''Dave Page''': This meeting -- where should it go next? What do we want to accomplish?<br />
<br />
'''Peter Eisentraut''': The obvious problem is that we have expanded it, we are running out of time. Should we allocate more time?<br />
<br />
''Discussion ensues about requiring more time and capacity of the meeting.''<br />
<br />
'''Robert Haas''': We could also have a couple of meetings. When we pushed the tech discussion out of this meeting, it pushed it back down to half a day. It allowed more people to be around in technical discussions. It could also help break out special purpose meetings. Could help have more open meetings, e.g. having a contributors meeting that people can openly attend.<br />
<br />
'''Dave Page''': Allows to have updates from other parts of the project so things do not get too compartmentalized.<br />
<br />
'''Jonathan Katz''': Great we have reports: good to see things beyond project. Want to ensure we have actionable follow-ups, the topics brought up seemed to affect across entire community.<br />
<br />
'''Peter Eisentraut''':<br />
<br />
1. Do we want to make the meeting larger?<br />
<br />
* 1 General consensus is '''no'''.<br />
<br />
2. Do we want to make it a full day?<br />
<br />
* 2 General consensus is '''yes'''.<br />
<br />
'''Noah Misch''': Technical topics that work well are topics that are well known and have not been solved yet through community methods, e.g. locales or working with bug fixes languishing in commitfest.<br />
<br />
'''Dave Page''': People can always present items for the agenda and can have a discussion on whether it should be in the meeting or be in the unconference.<br />
<br />
We will keep the meeting the same size. We will aim for a six hour meeting. Stephen & I will work on it again and call for volunteers (Andres Freund is voluntold). Include updates from various project areas. Keep them to 10 minutes each. We need to have a deadline to decide when people are invited by.<br />
<br />
====Action Items====<br />
<br />
* Having different parts of the project present and participating in the meeting is good and should continue<br />
* Ensure invites can be sent out by the early New Year 2020.<br />
* Dave Page, Stephen Frost, Andres Freund<br />
<br />
'''Meeting adjourns at 12:20pm.'''<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PgCon_2020_Developer_Meeting&diff=37556PgCon 2020 Developer Meeting2023-02-10T08:41:28Z<p>Alvherre: </p>
<hr />
<div>'''NOTE:''' This meeting is not happening since PGCon 2020 was moved online.<br />
<br />
----<br />
<br />
A meeting of the interested PostgreSQL developers is being planned for Tuesday 26 May, 2020 at the University of Ottawa, prior to PGCon 2020. In order to keep the numbers manageable, this meeting is by '''invitation only'''.<br />
Any questions regarding the invitations to this event should be directed to the team of individuals tasked with coming up with the list of people to invite:<br />
<br />
* Andres Freund<br />
* Stephen Frost<br />
* Dave Page<br />
<br />
The invitation list for the meeting has changed this year to include representatives from various project sub-teams, for example, packagers, the release team, Code of Conduct committee and more.<br />
<br />
An Unconference will be held on Friday for in-depth discussion of technical topics.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Define the schedule for the 14.0 release cycle<br />
* Address any proposed timing, policy, or procedure issues<br />
* Receive updates from project sub-teams on their activities and discuss any resulting issues or concerns.<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will (probably) be:<br />
<br />
* 9:00AM to 3PM<br />
* DMS 3105 - Desmarais Hall, 55 Laurier Avenue East<br />
* University of Ottawa.<br />
<br />
Lunch will be served during the meeting.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname). Note that we can accommodate a '''maximum of 30'''!<br />
<br />
# Andres Freund<br />
# Stephen Frost<br />
# Dave Page<br />
<br />
The following people will not be in Ottawa, and do not plan to attend:<br />
<br />
== Agenda Items ==<br />
<br />
* 14.0 release and commitfest schedule (Dave)<br />
* ''Please add suggestions for agenda items here. (with your name)''<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00am - 09:10am<br />
|Welcome and introductions<br />
|Dave Page<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30am - 11:00am<br />
|Coffee break<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:00pm<br />
|Lunch<br />
|All<br />
<br />
|- <br />
|2:50pm - 3:00pm<br />
|Any other business<br />
|Dave Page<br />
|}<br />
<br />
Note: This timetable is a rough guide only. Items will start as soon as the previous discussion is complete (breaks will not move materially however). Any remaining time before lunch may be used for Commitfest item triage or other activities.<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=FOSDEM/PGDay_2017_Developer_Meeting&diff=37555FOSDEM/PGDay 2017 Developer Meeting2023-02-10T08:40:31Z<p>Alvherre: add category</p>
<hr />
<div>A meeting of the interested PostgreSQL developers is being planned for Thursday 2nd February, 2017 at the Brussels Marriott Hotel, prior to FOSDEM/PGDay 2017. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).<br />
<br />
Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers that have been highly active on the database server over the 9.6 and 10 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.<br />
<br />
This is a PostgreSQL Community event.<br />
<br />
== Meeting Goals ==<br />
<br />
* Review the progress of the 10.0 schedule, and formulate plans to address any issues<br />
* Address any proposed timing, policy, or procedure issues<br />
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]<br />
<br />
== Time & Location ==<br />
<br />
The meeting will be:<br />
<br />
* 9:00AM to 5:00PM<br />
* Brussels Marriott Hotel<br />
<br />
Coffee, tea and snacks will be served starting at 8:45am. Lunch will be provided.<br />
<br />
== RSVPs ==<br />
<br />
The following people have RSVPed to the meeting (in alphabetical order, by surname) and will be attending:<br />
<br />
* Oleg Bartunov<br />
* Jeff Davis<br />
* Andrew Dunstan<br />
* Stephen Frost<br />
* Etsuro Fujita<br />
* Magnus Hagander<br />
* Petr Jelinek<br />
* Alexander Korotkov<br />
* Noah Misch<br />
* Bruce Momjian<br />
* Simon Riggs<br />
* Dave Page<br />
* Masahiko Sawada<br />
* Tomas Vondra<br />
<br />
The following people have sent their apologies:<br />
<br />
* Joe Conway<br />
* Dimitri Fontaine<br />
* Peter Geoghegan<br />
* Kyotaro Horiguchi<br />
* Shigeru Hanada<br />
* Amit Kapila<br />
* Tom Lane<br />
* Thomas Munro<br />
* Michael Paquier<br />
* Dean Rasheed<br />
* Craig Ringer<br />
* David Rowley<br />
* Teodor Sigaev<br />
* Heikki Linnakangas<br />
<br />
==Agenda Items==<br />
<br />
Please add agenda items here!<br />
<br />
* Sharding update<br />
<br />
* Setting up the Release Management Team for Postgres 10.0 (Simon)<br />
<br />
* Supporting management roles (aka: removing superuser checks) (Dave)<br />
<br />
* Adding DBA management roles (was Superowners) (Simon)<br />
<br />
* SQL/JSON in SQL-2016 Standard and our roadmap (Oleg)<br />
<br />
* Is it worth having loads of meetings if not everybody attends? (Simon)<br />
<br />
* Tools and services from pginfra (Magnus -- ''if'' others are interested, I don't have any specific entries myself)<br />
<br />
==Agenda==<br />
<br />
{| border="1" cellpadding="4" cellspacing="0"<br />
!Time<br />
!Item<br />
!Presenter<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|09:00 - 09:10<br />
|Welcome and introductions<br />
|Dave<br />
<br />
|- <br />
|09:10 - 09:20<br />
|10.0 Release Review<br />
|All<br />
<br />
|- <br />
|09:20 - 09:45<br />
|Setting up the Release Management Team for Postgres 10.0<br />
|Simon<br />
<br />
|- <br />
|09:45 - 10:00<br />
|Is it worth having loads of meetings if not everybody attends? <br />
|Simon<br />
<br />
|- <br />
|10:00 - 10:30<br />
|Momjian Half Hour<br />
|Bruce<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|10:30 - 11:00<br />
|Coffee break<br />
|All<br />
<br />
|- <br />
|11:00 - 11:30<br />
|SQL/JSON in SQL-2016 Standard and our roadmap<br />
|Oleg<br />
<br />
|- <br />
|11:30 - 12:15<br />
|Supporting management roles (aka: removing superuser checks)<br />
|Dave/Simon<br />
<br />
|- <br />
|12:15 - 12:45<br />
|Tools and services from pginfra<br />
|Magnus<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|12:45 - 13:45<br />
|Lunch<br />
|All<br />
<br />
|- <br />
|13:45 - 14:15<br />
|Performance Farm<br />
|Tomas<br />
<br />
|- <br />
|14:15 - 15:00<br />
|Open CommitFest Item Review<br />
|All<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|15:00 - 15:30<br />
|Tea break<br />
|All<br />
<br />
|- <br />
|15:30 - 17:00<br />
|Open CommitFest Item Review<br />
|All<br />
<br />
|- <br />
|16:45 - 17:00<br />
|Any other business<br />
|Dave<br />
<br />
|- style="font-style:italic;background-color:lightgray;"<br />
|17:00<br />
|Finish<br />
|<br />
|}<br />
<br />
== Minutes ==<br />
<br />
<pre><br />
Welcome<br />
--------<br />
<br />
Magnus: Be it noted that the Quadranteers were on time.<br />
<br />
Present: Oleg Bartunov, Andrew Dunstan, Stephen Frost, Etsuro Fujita, Magnus Hagander, Petr Jelinek, Alexander Korotkov, <br />
Noah Misch, Bruce Momjian, Dave Page, Masahiko Sawada, Tomas Vondra<br />
<br />
Apologies: Simon Riggs (travel issues prevented attendance)<br />
<br />
10.0 Release Review<br />
-------------------<br />
<br />
Dave: The point of this item is to quickly review the current status of the release and note any potential issues.<br />
Magnus: The real question is that we're looking for a September release. Do we still think we can do that?<br />
Noah: I don't see anything that would stop that.<br />
<br />
All agree we're currently on track.<br />
<br />
Setting up the Release Management Team for Postgres 10.0<br />
--------------------------------------------------------<br />
<br />
Noah: Do we want an RMT again, and if so, do we want it to behave any differently from last year?<br />
Stephen: From my perspective it seemed like a good thing and was helpful. How did it seem from the inside?<br />
Noah: It was a lot of not very interesting work.<br />
Bruce: What sort of work?<br />
Noah: Keeping track of items and chasing people. Some of it could be automated perhaps.<br />
Magnus: There's value in personal chasing rather than automated.<br />
Noah: We also did the scary patch tournament!<br />
Bruce: The big value is that we ensure everything gets done.<br />
Petr: It certainly helps that there are committers on the team, and they can, if needed, just revert a patch.<br />
Stephen: It revolves around the open items list.<br />
Noah: Certainly.<br />
Magnus: Everyone is free to add, RMT removes.<br />
Noah/Petr: People added to the open items list because they realised that's what the RMT are following<br />
Dave: Is there anything that could be automated to ease the process?<br />
Noah: Maybe a dashboard of what needs to be chased today?<br />
Dave: Do we have the info needed for that on the CF app?<br />
Noah: Not really, as we don't track the open items there specifically<br />
Stephen: I have to email the list when I add an open item anyway, so it would be cool if I could have a tag and the CF <br />
app could pick that up.<br />
Petr: What about tracking open items as part of the CF?<br />
Magnus: The workflow for open items really isn't the same as it is for a CF. I'm worried that merging these functions <br />
together will make both processes less optimal.<br />
Noah: Another task is trying to figure out what commit caused an open item, which is not always easy.<br />
Dave: We could link to commits when closing items on the CF app, much like Redmine does<br />
Dave: It seems like we're all in agreement that we want an RMT again.<br />
Noah: Yes, I hear only good things.<br />
Magnus: Who was on the last team?<br />
Noah: Me, Robert Haas and Alvaro.<br />
Magnus: We should not use the same people again to avoid burnout.<br />
Noah: I'd be happy to do it - it's kind of calming.<br />
Dave: It's a good thing to have one person roll over to the next year to build institutional knowledge/experience.<br />
Stephen: I'd like to have a non-committer on the team.<br />
Bruce: Should we have someone outside the Americas?<br />
<discussion on where the center of the world is; Britain of course>.<br />
Noah: Timing isn't a major issue - we don't need every RMT member in close timezones.<br />
Stephen: Members of the RMT need to be very vocal and outspoken. They need to be a trusted voice and willing to deliver<br />
bad news<br />
Dave: So we're agreed we need an RMT again, and Noah is willing to do it again or step down as needed.<br />
Andrew: So the RMT is active from the end of the last commitfest until the release?<br />
Noah: Yeah.<br />
Dave: So if Noah is willing, I'd propose that he takes the lead on forming this year's team.<br />
Noah: Ok.<br />
Bruce: Alexander would be good.<br />
Noah: Are you interested?<br />
Alexander: Yes<br />
<br />
TODO: Noah to form RMT.<br />
<br />
Is it worth having loads of meetings if not everybody attends?<br />
--------------------------------------------------------------<br />
<br />
Dave: <describes past developer meetings><br />
Dave: I expect Tokyo to be an exception - once every few years<br />
Bruce: Will there be a Tokyo conference next year?<br />
Etsuro: We'll have an Asia conference, but maybe China or elsewhere.<br />
Noah: The important thing is we have a meeting with a critical mass of developers<br />
Stephen: Yes, Ottawa is good for that.<br />
Magnus: Ottawa is good for admin/procedural<br />
Andrew: Should we make Brussels more open, an unconference style?<br />
Magnus: I think that works well at Ottawa because of the large conference as well. We could have an entire open meeting <br />
on patch triage for example though.<br />
Bruce: Who was in Tokyo (about half). That shows maybe geographical distribution may not be an issue.<br />
Magnus: Lists people who went to Tokyo. There was only one person who was in Tokyo who hasn't been in Ottawa or Brussels<br />
Noah: I don't really want to travel that much - I'm only here because I was in Europe anyway.<br />
Bruce: Was Tokyo useful?<br />
Dave: I think it was useful to meet with our Japanese colleagues who we rarely see, but I don't think it was a forum for<br />
making decisions.<br />
Magnus: If we didn't have the Tokyo meeting, maybe we would have had a full agenda today.<br />
Stephen: I think it's useful to have some number of developers talking through designs etc. at multiple conferences.<br />
Dave: So really what you're saying is that we should have technical un-conferences<br />
Bruce: Yeah, or maybe half and half.<br />
Stephen: When I was thinking of coming here, I wasn't thinking so much about the technical content, but perhaps I should.<br />
Magnus: Having the CF review is a good thing, and it doesn't need to be closed.<br />
Bruce: We could have 2 rooms, one for patch triage and one for unconference.<br />
Andrew: For serious triage, you need Toms and Alvaros and so on.<br />
Dave: So, keep Ottawa as it is, and make other dev meetings more unconference/triage events.<br />
Stephen: Right - but we need to ensure senior devs attend.<br />
Bruce: I don't want to preclude procedural discussions at other events though.<br />
Dave: We can always take an unconference session if needed.<br />
Magnus: Or start the unconference an hour later.<br />
<br />
TODO: Dave to investigate options for Brussels/Asia next year.<br />
<br />
Momjian Half Hour<br />
-----------------<br />
<br />
Dave: I'm failing as a moderator as this item now needs to be the Momjian 18 minutes.<br />
Bruce: I'm not sure I even have 5 minutes.<br />
Bruce: I want to recap on sharding that we discussed in Tokyo. Simon said it looked like we had a workable project - and<br />
with the various bits of work on partitioning and pushdown etc. it looks like we'll have something for 10.0, but<br />
not everything.<br />
Petr: Declarative partitioning doesn't work with FDWs yet.<br />
Bruce: Yes, that limits what can be done in 10.0<br />
Noah: What are the projects ongoing that are part of this?<br />
Bruce: I put a blog out after Tokyo that links to the wiki where I'm tracking the various parts of the project: <br />
https://wiki.postgresql.org/wiki/Built-in_Sharding<br />
Noah: Are there patches in the CF right now?<br />
Bruce: Yes, unfortunately they're just sitting there.<br />
Stephen: Which ones?<br />
Bruce: Parallel foreign data push-down<br />
Etsuro: Yes, that's been proposed but no one has reviewed it yet. There's also a transaction manager proposal that's <br />
received no feedback.<br />
Bruce: Whilst there are things waiting, overall I'm very happy with how fast things are going.<br />
Bruce: On security...<br />
Bruce: There's more to it than SSL certs etc - auditing, policy and more - and we don't do enough. We have a mindset in <br />
the community that "if it can't be 100% bulletproof, we won't do it". <br />
Bruce: We need to do much more, and accept that it won't be perfect.<br />
Dave: I don't think we're holding back on things like SSL cert docs for that reason - we can just improve them. On the <br />
other hand, we also know that RLS isn't perfect and has some known covert channels - but we recognise there's <br />
little we can do about that.<br />
Masahiko: We've been doing work on pg_audit.<br />
Stephen: We need to figure out if we can put it in core.<br />
Tomas: It's a similar problem to pg_logical, in that it started out as an extension. We will soon have 3 forks of <br />
pg_audit - which is not good.<br />
Petr: We'll always want more features; we have to understand that the in-core solution might only be 90%.<br />
Tomas: (to Stephen) You should talk to Abhijit if you're interested in making pg_audit in-core again.<br />
Bruce: I think the two areas where we're lacking are documentation and auditing.<br />
Stephen: I agree that docs need improvement, but I don't think anyone is saying we shouldn't just do that. I think we <br />
need more than auditing though - some kind of cell based RLS.<br />
<br />
TODO: Bruce to improve docs on SSL certificate setups<br />
TODO: Bruce to complete TODOs from last year's meeting.<br />
<br />
SQL/JSON in SQL-2016 Standard and our roadmap<br />
---------------------------------------------<br />
<br />
Oleg: In December ISO released a new SQL/JSON standard. The reason I want to talk is that we need to decide what to do.<br />
Postgres was the first DB with native JSON, and we have many users. Before the standard, we designed what we <br />
wanted, but now, should we move to the SQL/JSON standard? SQL/JSON is not compatible with existing JSON - it looks<br />
like JSONB. It specifies the data model, and originates from Oracle.<br />
Magnus: MySQL were very proud that they had the first compliant database - because they had access to the pre-release<br />
Oracle documentation!<br />
Andrew: What are the differences between standard and our implementation?<br />
Alexander: There are 9 functions like json_path for example, which uses dot syntax instead of slash syntax.<br />
Oleg: Our JSON is just a string - we preserve everything. Our JSONB is binary, which doesn't preserve anything. The new<br />
standard is un-ordered.<br />
Alexander: There's also a naming convention issue. All our JSONB functions start with jsonb_.<br />
Andrew: So it's really just a set of functions. There's no datatype as such.<br />
Stephen: It would be really nice to have support for the standard in core. <br />
Petr: The question is, should we have a new datatype for json_path?<br />
Andrew: Just use a string!<br />
Stephen: Do we have anything already that occupies this space?<br />
Oleg: No.<br />
Alexander: If we take a string, we need a way to cache.<br />
Stephen: If they take a string, then we can overload alternatives.<br />
Magnus: JSONB sounds much better. The question is whether we can map the standard to it.<br />
Oleg: No problem.<br />
Andrew: Can you do it in a couple of weeks?<br />
Oleg: The first problem is development. The second is that to review you need a copy of the standard, which we cannot<br />
share.<br />
Andrew: I'll get a copy, and can review.<br />
Oleg: We'll have something for Postgres 10.0. Teodor is very interested - three weeks should not be a problem.<br />
Alexander: The standard is very fixed, but should we allow the user to use catalog functions or operators etc.<br />
Andrew: No, follow the standard for now.<br />
<br />
TODO: PostgresPro team to implement functions, Andrew to review.<br />
<br />
Supporting management roles (aka: removing superuser checks)<br />
------------------------------------------------------------<br />
<br />
<notes taken by Noah><br />
<br />
Some users reject pgadmin & other tools due to superuser access.<br />
dpage wants grantable permission to read log_directory.<br />
PEM/pgadmin uses pg_ls_dir/pg_read_file to read logs<br />
sfrost: could offer pg_read_log instead?<br />
dpage: system also wants to read postgresql.conf<br />
sfrost: risk if everything routes through postgres protocol<br />
dpage: PEM agent runs on each server, but some users don't run it.<br />
magnus: do you need postgresql.conf, or is pg_settings enough<br />
dpage: requires superuser access for file-path settings<br />
dpage: want verbatim file to catch unapplied changes<br />
frost/magnus: there's another feature for that<br />
frost: don't want PG to expose this much<br />
nmisch: once you have to run more than "create role", might as well create a<br />
whole database with everything you need<br />
dpage: doesn't want to have to grant pg_catalog objects to some role<br />
nmisch: I think pg_backup is defineable, but hackers did not<br />
dpage PEM needs: pg_tablespace, read lots of GUCs, pg_create_restore_point,<br />
pg_start_backup, pg_stat_activity, pg_ls_dir/pg_read_file OR higher-level replacement<br />
nmisch: would they mind if you create your own database, vs. objects in their database?<br />
dpage: some would mind less, some would still dislike<br />
frost: allowing pg_ls_dir grant invites people to grant too much<br />
magnus: people won't migrate from pg_ls_dir to another interface<br />
frost: let's give people interfaces to the things they need<br />
dpage: had ten years of monitoring running as superuser. need to improve that somehow.<br />
frost: pg_read_file,pg_write_file are sufficiently dangerous<br />
magnus/frost: read file basically gives you superuser via ssl private key,<br />
kerberos keytab, etc<br />
frost: still helpful if you consider this as defense against bugs in monitoring<br />
code, not hostile monitoring code<br />
frost: we could solve the read-all-GUCs case for 10.0.<br />
magnus could add pg_list_logs, pg_ls_xlog<br />
frost: have what-is-oldest-xlog function in backend today? magnus: doubts it<br />
dpage: monitoring wants to track volume of xlog files, not actual file names<br />
frost: keep it correct for nondefault segment size<br />
nmisch: treat xlog as stream of bytes; breaking into files is implementation<br />
detail<br />
frost: still want to reduce number of superuser checks. extension whitelist?<br />
magnus: risk of providing half-way answer is that people never get to full-way<br />
magnus: these individual new functions/roles are low-risk<br />
frost: doesn't want the function reaching to syslog or something<br />
dpage: will this ever lead to default roles?<br />
nmisch: will never have one pg_monitoring role for PEM<br />
magnus: could have N roles that together suffice for PEM.<br />
frost: don't want 100 default roles.<br />
dpage: SHOW has no grant<br />
nmisch: could add a builtin role, similar to pg_signal_backend<br />
frost: riggs proposing patch to give db owner more privileges<br />
nmisch: had been debated in the past, some people strongly in favor of status<br />
quo. petr: people coming from other databases don't like it<br />
frost: wants it more like an owner of all objects. not like local superuser<br />
in particular, allow revoking rights from the owner<br />
magnus, nmisch: won't allow untrusted function creation<br />
frost: gets requests for readonly users (pg_dump role)<br />
magnus: compare db_datareader in sql server<br />
petr: risk of changing database owner rights due to effects on existing DBs<br />
petr/dunstan: riggs wants dbowner to operate as object owner<br />
vondra: what is riggs goal? dunstan: arose from difficulty of bucardo in RDS<br />
frost: break up owner rights, create default role for each. but want it to be<br />
granted at the DB level. considering db-level catalog.<br />
magnus: cluster-level enough for many people<br />
dpage: might write pg_ls_log_dir, still wants pg_ls_wal_dir<br />
nmisch: I think the key thing to resolve is the list of default roles<br />
pgadmin/PEM would need, then acquire consensus that those roles would be<br />
sufficiently well-defined to put in core.<br />
<br />
conclusions:<br />
- welcome specific proposals for new lower-priv interfaces, grantability, default roles<br />
- read-all and write-all proposals welcome<br />
<br />
Tools and services from pginfra<br />
-------------------------------<br />
<br />
Magnus: What stuff do people want? E.g. tracking of open items etc. Let's think about it over lunch and talk afterwards.<br />
<br />
<lunch><br />
<br />
Magnus: So are there any comments on services that need improvements or services we lack?<br />
Petr: I asked about this last year, but the CF app doesn't email me if someone changes something on an item.<br />
Magnus: Huh? If you create comments it will email.<br />
Petr: I also don't hear when the status changes.<br />
Magnus: I definitely get those. We should debug why you don't get them later.<br />
Noah: The -announce list can look pretty polluted in the archives when people reply to items by CC'ing different lists.<br />
Stephen: <pointing to Magnus> HAH!!<br />
Magnus: We need to special case -announce. We don't allow CC's on -announce messages, but the archives work on msgid.<br />
Dave: While we're talking about mailing lists, let's remind everyone that we're changing mailing list software which<br />
will have knock-on effects:<br />
Magnus: No subject mangling, no footers, new namespace. Needed so we can support DKIM.<br />
Noah: So what about archives for closed lists?<br />
Magnus: A separate instance of the archives, with community auth based access.<br />
Andrew: Will there be a web interface?<br />
Magnus: Yes<br />
Dave: It's vastly more user friendly than mj2, and highly simplified.<br />
Stephen: Oh, and we will no longer break signed emails.<br />
Noah: I've noticed that the mbox archives mis-handle git attachments. <br />
Magnus: Yeah, that's mj2.<br />
Dave: Shall we talk about AWS?<br />
Magnus: We have AWS credits that people can use for dev/test. If anyone wants to be guinea pig for the access processes<br />
please let us know (sysadmins@postgresql.org).<br />
<br />
Performance Farm<br />
----------------<br />
<br />
Tomas: If you remember the meeting last year, we decided to build a performance farm, like the build farm. We have the<br />
client basically working and running pgbench and TPC-H/TPC-C. What we don't have now is the server side.<br />
Andrew: I'll be working on it as well. It may be worth roping in Christophe.<br />
Dave: We also have the skeleton website (Django) which ties into community auth etc.<br />
Tomas: The initial version will be designed to run tests following changes. We won't allow execution of arbitrary code.<br />
We want to get more people involved - maybe we should setup a mailing list.<br />
Stephen: How much data will we get? The buildfarm got kinda big.<br />
Andrew: Results should be much smaller - there are no build logs etc.<br />
Tomas: Output from sar is collected, and might be quite big.<br />
Stephen: We need to benchmark the storage requirements so we know how best to host the server. We also need to define<br />
a retention policy.<br />
Dave: We can use RRD-like storage.<br />
Tomas: We can de-dupe consecutive duplicate results.<br />
Stephen: We should think about partitioning.<br />
Tomas: I'm OK with that, but it is likely not compatible with retention times. We need to look at this in more detail <br />
once we know what storage is used by a working system.<br />
Oleg: What about upgraded machines?<br />
Tomas: The client collects a number of stats - we can track them.<br />
Noah: I think the main thing is having lots of clients, so we'll see real regressions on multiple machines.<br />
Dave: The biggest problem is that we're all so busy<br />
Noah: Once the basics are done, that's the hard part, then we can move on to refining things over time.<br />
<br />
TODO: Dave/Tomas/Andrew: Schema design<br />
TODO: Dave/Andrew: Machine registration/admin etc.<br />
TODO: Dave: Mailing list<br />
<br />
Any Other Business<br />
------------------<br />
<br />
None.<br />
</pre><br />
<br />
<br />
[[Category:Developer Meeting]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=Working_with_Git&diff=37443Working with Git2023-01-17T18:06:52Z<p>Alvherre: /* Getting Started */ fix broken links</p>
<hr />
<div>This page collects various wisdom on working with the [https://git.postgresql.org/ PostgreSQL Git repository]. There are also [[Other Git Repositories]] you might work with, most notably the official [https://github.com/postgres Github mirror] which you might fork on that site.<br />
<br />
==Getting Started==<br />
<br />
A simple way to get started might look like this:<br />
<br />
git clone https://git.postgresql.org/git/postgresql.git<br />
cd postgresql<br />
git checkout -b my-cool-feature<br />
$EDITOR<br />
git commit -a<br />
git diff --patience master my-cool-feature > ../my-cool-feature.patch<br />
<br />
Note that <code>git checkout -b my-cool-feature</code> creates a new branch and checks it out at the same time. Typically, you would develop each feature in a separate branch.<br />
<br />
See the documentation and tutorials at https://git-scm.com/doc/ext for a more detailed Git introduction. For an even more detailed lesson, check out [https://git-scm.com/book/en/v2 the Pro Git book] and maybe get a hardcopy to help support the site.<br />
<br />
You may wish to put the entries listed at [[GitExclude]] in your .git/info/exclude.<br />
Now that the master repository has been converted to git, the standard<br />
.gitignore files should cover all build products, so you don't need<br />
most of what is listed in that reference. You might still want to<br />
exclude *~, tags, and /cscope.out, though.<br />
<br />
=== Keeping your local master branch synchronized ===<br />
<br />
First, add the origin as a remote. You only need to do this once:<br />
<br />
git remote add origin https://git.postgresql.org/git/postgresql.git<br />
<br />
Next, fetch from your public git repository:<br />
<br />
git fetch origin master<br />
<br />
Merge any new patches from your public repository:<br />
<br />
git merge FETCH_HEAD<br />
<br />
Merge in any changes from the main branch:<br />
<br />
git fetch origin master<br />
git merge FETCH_HEAD<br />
<br />
Now check that it still compiles, passes regression, etc. Make sure you've<br />
invoked ./configure, and then:<br />
<br />
make check<br />
make maintainer-clean<br />
<br />
Assuming all that's good, do a dry run.<br />
<br />
git push --dry-run origin master<br />
<br />
If that's happy, push it out to your public repository.<br />
<br />
git push origin master<br />
<br />
If not, fix any merge failures, do another dry run, and push.<br />
<br />
=== Tracking Other Branches ===<br />
<br />
Let's say you're happy tracking master, but you'd really like to track any one of the other potential branches at git.postgresql.org.<br />
<br />
git remote add <super-fun-branch> https://git.postgresql.org/super-fun-branch.git<br />
git fetch super-fun-branch<br />
git checkout super-fun-branch #this will stage your remote branch for a local checkout<br />
git checkout -b super-fun-branch-name #the name can be whatever you choose<br />
<br />
Now you have a local branch within your local git repo tracking a different branch's history. Most importantly, you can now push to that repo if you have to without making an explicit clone to track the history. It's pretty much impossible to not share some common history with the master branch.<br />
<br />
=== Using Back Branches ===<br />
<br />
Since the git repository contains branches for each of the major versions of PostgreSQL, it's easy to work on the latest code from an older version instead of the current one. Here's how you might list the possibilities and checkout an older version:<br />
<br />
git branch -r<br />
git checkout -b REL_15_STABLE origin/REL_15_STABLE<br />
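<br />
The full <code>git branch -r</code> listing can be long. As a minimal sketch, the listing can be narrowed to release branches only. The helper name <code>list_release_branches</code> is hypothetical, and it assumes the remote is called <code>origin</code> and that release branches follow the <code>REL..._STABLE</code> naming used in the example above:<br />
<br />
```shell
# Hypothetical helper: list only the remote release branches.
# Assumes the remote is named "origin" and that release branches
# follow the REL..._STABLE naming seen in the example above.
list_release_branches() {
    git branch -r --list 'origin/REL*_STABLE'
}
```
<br />
Run it from inside your clone; any name it prints can then be passed to <code>git checkout -b</code> as shown above.<br />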
<br />
Note that if you've already checked out and used a later version, you might need to clean up some of the files left behind by it. It's suggested to run:<br />
<br />
make maintainer-clean<br />
<br />
To get rid of as many of those as possible. You might need to delete some files left behind after that anyway before git will allow you to do the checkout (src/interfaces/ecpg/preproc/preproc.y can be a problem with the specific example above).<br />
<br />
=== Testing a patch ===<br />
<br />
This is a typical setup to review a patch text file, as typically sent by e-mail:<br />
<br />
git checkout -b feature-to-review<br />
patch -p1 < feature.patch<br />
<br />
If the patch fails to apply, there will be file.rej files left behind showing the part that didn't apply. If your directory tree is clean of build information, you can easily find these later using:<br />
<br />
git status<br />
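<br />
If the tree is not clean of build products, the <code>git status</code> output may be noisy. A small sketch that lists only the reject files, using plain <code>find</code> rather than Git; the helper name <code>list_rejects</code> is made up for illustration:<br />
<br />
```shell
# Hypothetical helper: list reject files left behind by a failed patch run.
# Searches the given directory, or the current one if none is given.
list_rejects() {
    find "${1:-.}" -type f -name '*.rej'
}
```
<br />
Calling <code>list_rejects</code> with no argument from the top of the source tree prints one path per reject file, and prints nothing if the patch applied cleanly.<br />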
<br />
=== Patch cleanup ===<br />
<br />
Patch diff submission works best when the author does a round of self-review of the actual patch--not just the code, but the physical diff file produced. [[Creating Clean Patches]] covers practices commonly used to produce better patch diff output.<br />
<br />
==Publishing Your Work==<br />
<br />
If you develop a feature over a longer period of time, you want to allow for intermediate review. The traditional approach to that has been emailing huge patches around. The more advanced approach that we want to try (see also Peter Eisentraut's [http://petereisentraut.blogspot.com/2008/02/on-patch-review.html blog entry]) is that you push your Git branches to a private area on <code>git.postgresql.org</code>, where others can pull your work, operate on it using the familiar Git tools, and perhaps even send you improvements as Git-formatted patches. See [https://git.postgresql.org/adm/help the git.postgresql.org site] for instructions on how to sign up, and how to use the repository. You may need to eventually create a patch via e-mail as part of officially [[Submitting a Patch]].<br />
<br />
==Pushing New Branches==<br />
<br />
If you create a new branch, generally for a new feature test, you'll need to push it to git.postgresql.org. <br />
<br />
git push origin new_feature_branch<br />
<br />
Note that if you have a completely blank repository, not even the "master" branch will exist, so it will need to be pushed.<br />
<br />
If you ''are'' working with the postgresql core code, do NOT casually make up your own branches and push them, without clearing it on the pgsql-hackers list first. Generally, you want to use your private repo area instead.<br />
<br />
==Removing a Branch==<br />
<br />
Once your feature has been committed to the PostgreSQL repository, you can usually remove your local feature branch. This works as follows:<br />
<br />
# switch to a different branch<br />
git checkout master<br />
git branch -D my-cool-feature<br />
<br />
==Working with the users/foo/postgres.git==<br />
<br />
One option while requesting a project at git.postgresql.org is to have a clone of the main postgresql repository.<br />
<br />
That is a very nice feature, but how do you sync the upstream code?!<br />
<br />
One method is to create a git clone in your own repository and add a new remote to handle the syncing:<br />
<br />
# clone your repos<br />
git clone ssh://git@git.postgresql.org/users/foo/postgres.git my_postgres<br />
<br />
# add a new remote<br />
git remote add pgmaster https://git.postgresql.org/git/postgresql.git<br />
<br />
# track some old versions<br />
git checkout -b REL8_3_STABLE origin/REL8_3_STABLE<br />
git checkout -b REL8_4_STABLE origin/REL8_4_STABLE<br />
<br />
# change the remote of master and our old versions tracked<br />
git config branch.REL8_3_STABLE.remote pgmaster<br />
git config branch.REL8_4_STABLE.remote pgmaster<br />
git config branch.master.remote pgmaster<br />
<br />
# pull from postgres official git for each branch<br />
# and finally push to origin<br />
git checkout master<br />
git pull<br />
git push origin<br />
git checkout REL8_3_STABLE<br />
git pull<br />
git push origin<br />
git checkout REL8_4_STABLE<br />
git pull<br />
git push origin<br />
<br />
<br />
This way, PostgreSQL is easy to sync for each branch: pull from the official repository and push to your own.<br />
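<br />
The repeated checkout/pull/push steps can be wrapped in a loop. This is only a sketch under the setup described above: each branch must already have its upstream remote configured, and your own repository must be the remote named <code>origin</code>. The function name <code>sync_branches</code> is hypothetical, and naming the branch explicitly in <code>git push</code> is a choice made here, not part of the original recipe:<br />
<br />
```shell
# Hypothetical helper: for each branch named on the command line,
# pull from the branch's configured upstream remote, then push the
# result to the remote called "origin". Stops at the first failure.
sync_branches() {
    for branch in "$@"; do
        git checkout "$branch" || return 1
        git pull || return 1
        git push origin "$branch" || return 1
    done
}
```
<br />
For the branches tracked above, this would be invoked as <code>sync_branches master REL8_3_STABLE REL8_4_STABLE</code>.<br />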
<br />
Create your own branch and work as usual. Users who have a local clone of the postgresql.git can add your branch in their repository and happily merge, just as you do.<br />
<br />
==Using the Web Interface==<br />
<br />
Try the web interface at https://git.postgresql.org/. It offers browsing, "blame" functionality, snapshots, and other advanced features, and it is much faster than CVSweb. Even if you don't care for Git or version control systems, you will probably enjoy the web interface.<br />
<br />
==RSS Feeds==<br />
<br />
The Git service provides RSS feeds that report about commits to the repositories. Some people may find this to be an alternative to subscribing to the pgsql-committers mailing list. The URL for the RSS feed from the PostgreSQL repository is https://git.postgresql.org/gitweb/?p=postgresql.git;a=rss. Other options are available; they can be found via the [https://git.postgresql.org/ home page] of the web interface.<br />
<br />
==PostgreSQL Style==<br />
<br />
The PostgreSQL source uses 4-character tabs, making the output from <code>git diff</code> look odd. You can fix that by putting this into your <code>.git/config</code> file:<br />
<br />
[core]<br />
pager = less -x4<br />
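<br />
The same setting can also be written with <code>git config</code> instead of editing the file by hand. A minimal sketch follows; the helper name <code>set_tab_pager</code> is made up, and it takes the path to a working tree as its only argument:<br />
<br />
```shell
# Hypothetical helper: store the pager setting in one repository's
# .git/config. The first argument is the path to the working tree.
set_tab_pager() {
    git -C "$1" config core.pager 'less -x4'
}
```
<br />
For example, <code>set_tab_pager ~/postgresql</code> would apply it to a clone in that directory (the path is only an example); the setting stays local to that repository.<br />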
<br />
==Continuing the "rsync the CVSROOT" workflow==<br />
<br />
Aidan van Dyk {{messageLink|20090602162347.GF23972@yugib.highrise.ca|published a nice tutorial}} on how to keep several branches using a single copy of historical objects. This is roughly equivalent to keeping several checkouts of a rsync'ed copy of CVSROOT, which is what some committers were used to doing with CVS.<br />
<br />
<br />
[[Category:Git]]</div>Alvherrehttps://wiki.postgresql.org/index.php?title=PostgreSQL_15_Open_Items&diff=37148PostgreSQL 15 Open Items2022-08-12T13:04:57Z<p>Alvherre: solved: "MERGE fails if inside a CTE"</p>
<hr />
<div>== Open Issues ==<br />
<br />
'''NOTE''': Please place new open items at the end of the list.<br />
<br />
* [https://www.postgresql.org/message-id/20220616233130.rparivafipt6doj3%40alap3.anarazel.de PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size]<br />
** Owner: Andrew Dunstan<br />
* [https://www.postgresql.org/message-id/aada8f97-924e-5661-aead-257aa346899c@enterprisedb.com GROUP BY optimization defeated partitionwise tests]<br />
** [https://www.postgresql.org/message-id/3242058.1659563057%40sss.pgh.pa.us Possibly-related complaint here]<br />
** Owner: Tomas Vondra (db0d67db2)<br />
* [https://www.postgresql.org/message-id/20220802175043.GA13682@telsasoft.com CREATE DATABASE STRATEGY WAL_LOG crash and memory corruption]<br />
** Owner: Robert Haas (9c08aea6a3090a396be334cc58c511edab05776a)<br />
* [https://www.postgresql.org/message-id/CACawEhXwHN3X34FiwoYG8vXR-oyUdrp7qcfRWSzS+NPahS5gSw@mail.gmail.com Materialized view rewrite is broken when there is an event trigger]<br />
** Owner: Michael Paquier (b0483263dda0824cc49e3f8a022dab07e1cdf9a7)<br />
<br />
== Decisions to Recheck Mid-Beta ==<br />
<br />
== Older bugs affecting stable branches ==<br />
<br />
=== Live issues ===<br />
<br />
* [https://www.postgresql.org/message-id/flat/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com RecoveryConflictInterrupt() is unsafe in a signal handler]<br />
** This seems to [https://www.postgresql.org/message-id/447238.1651082925%40sss.pgh.pa.us explain buildfarm failures in 031_recovery_conflict.pl]<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-WzkjjCoq5Y4LeeHJcjYJVxGm3M3SAWZ0%3D6J8K1FPSC9K0w%40mail.gmail.com REINDEX on a system catalog can leave index with two index tuples whose heap TIDs match]<br />
** In other words, there is a rare case where the HOT invariant is violated. The same HOT chain is indexed twice due to confusion about which precise heap tuple should be indexed.<br />
** Unclear what the user impact is.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com memory leak with JIT inlining]<br />
** [https://www.postgresql.org/message-id/flat/20210331040751.GU4431%40telsasoft.com#cc34872765add8e483e05009212d9d39 Another report of (same?) issue and reproducer]<br />
** [https://www.postgresql.org/message-id/flat/9f73e655-14b8-feaf-bd66-c0f506224b9e%40stephans-server.de Another report]<br />
** [https://www.postgresql.org/message-id/flat/16707-f5df308978a55bf8%40postgresql.org Another report]<br />
<br />
* [https://www.postgresql.org/message-id/CAEze2WgGiw%2BLZt%2BvHf8tWqB_6VxeLsMeoAuod0N%3Dij1q17n5pw%40mail.gmail.com Non-replayable WAL records through overflows and >MaxAllocSize lengths]<br />
** In other words, we can write xlog records that we can't read (plus potentially actual WAL corruption), making the instance unrecoverable and blocking any replication.<br />
** Exploitation seems limited to WAL records of 2PC and logical replication, and extension-generated WAL.<br />
** Affects all stable branches.<br />
<br />
* [https://www.postgresql.org/message-id/flat/dc9dd229-ed30-6c62-4c41-d733ffff776b%40xs4all.nl TOAST fetches could perhaps occur after the needed data has been removed]<br />
** The symptom originally reported in the thread was fixed by {{PgCommitURL|9f4f0a0dad4c7422a97d94e4051c08ec6d181dd6}}, but nobody is very happy with the status quo in this area. Do we need to do more now?<br />
** Affects all stable branches.<br />
<br />
=== Fixed issues ===<br />
<br />
* [https://www.postgresql.org/message-id/CAH2-Wzn22s42h4Lh6v96GsXSKGd%3D_6b76mjqip_WFCGnBmTJCw%40mail.gmail.com CLUSTER sort on abbreviated expressions is broken]<br />
** Affects all stable branches.<br />
** Fixed by: {{PgCommitURL|8ab0ebb9a842dc6063d1374a38b47a3b7ee64afe}}<br />
<br />
* [https://www.postgresql.org/message-id/17485-396609c6925b982d%40postgresql.org Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY]<br />
** Affects v14<br />
** Fixed by: {{PgCommitURL|e28bb885196916b0a3d898ae4f2be0e38108d81b}}<br />
<br />
* [https://www.postgresql.org/message-id/20220519193839.GT19626%40telsasoft.com -c min_dynamic_shared_memory now triggers an assertion]<br />
** Affects v14<br />
** Fixed by: {{PgCommitURL|7201cd1862}}<br />
<br />
* [https://www.postgresql.org/message-id/f8a4105f076544c180a87ef0c4822352%40stmuk.bayern.de Extension pg_trgm, permissions and pg_dump order]<br />
** Affects all stable branches.<br />
** Fixed by: {{PgCommitURL|00377b9a02b89a831ae50e1c718d34565356698f}}<br />
<br />
== Non-bugs ==<br />
<br />
== Resolved Issues ==<br />
<br />
=== Resolved before 15beta4 ===<br />
<br />
* [https://www.postgresql.org/message-id/17579-82482cd7b267b862%40postgresql.org MERGE fails if inside a CTE]<br />
** Fixed by: {{PgCommitURL|455d254d22665eb}}<br />
<br />
=== Resolved before 15beta3 ===<br />
<br />
* [https://www.postgresql.org/message-id/CAApHDvrHQkiFRHiGiAS-LMOvJN-eK-s762=tVzBz8ZqUea-a_A@mail.gmail.com tuplesort Generation memory contexts don't play nicely with index builds]<br />
** Owner: David Rowley<br />
** Fixed by: {{PgCommitURL|ae1123f9899fe80935ae344e38f18632beb1bf9a}}<br />
* [https://www.postgresql.org/message-id/YrpVkADAY0knF6vM@paquier.xyz Repeatability of installcheck for test_oat_hooks]<br />
** Owner: Andrew Dunstan<br />
** Fixed by: {{PgCommitURL|a6434b951558baad8372dc4b83bf87606dac9cda}}<br />
* [https://www.postgresql.org/message-id/20220530190155.47wr3x2prdwyciah@alap3.anarazel.de Revert debugging added due to 019_replslot_limit]<br />
** Owner: Andres Freund<br />
** Reverted: {{PgCommitURL|3f8148c256e067dc2e8929ed174671ba7dc3339c}}<br />
* [https://www.postgresql.org/message-id/CAApHDvqXpLzav6dUeR5vO_RBh_feHrHMLhigVQXw9jHCyKP9PA%40mail.gmail.com PG15 beta1 sort performance regression due to Generation context change]<br />
** Owner: David Rowley<br />
* [https://www.postgresql.org/message-id/20220706224727.GA2158260@nathanxps13 pg_parameter_aclcheck() and trusted extensions]<br />
** Owner: Tom Lane (a0ffa885e478f5eeacc4e250e35ce25a4740c487)<br />
** Fixed by: {{PgCommitURL|13d83881514856353dc86575eb0fc28132349a60}}<br />
* [https://www.postgresql.org/message-id/YtjsbtZFCaou6C/k@paquier.xyz Unprivileged user can induce crash by using an SUSET param in PGOPTIONS]<br />
** Owner: Tom Lane (a0ffa885e478f5eeacc4e250e35ce25a4740c487)<br />
** Fixed by: {{PgCommitURL|b35617de37870756bdb0e00ffc0a42441e56eefa}}<br />
* [https://www.postgresql.org/message-id/20220726050402.vsr6fmz7rsgpmdz3@jrouhaud wrong filename used in pg_ident_file_mapping infrastructure]<br />
** Owner: Michael Paquier (a2c84990bea7beadb599d02328190e2a763dcb86)<br />
** Fixed by: {{PgCommitURL|27e0ee57f68d27af68967759a2ff61a581f501dc}}<br />
* [https://www.postgresql.org/message-id/17558-3f6599ffcf52fd4a%40postgresql.org Endless loop with UNIQUE NULLS NOT DISTINCT and INSERT ... ON CONFLICT]<br />
** Owner: Peter Eisentraut (94aa7cc5f707712f592885995a28e018c7c80488)<br />
** Fixed by: {{PgCommitURL|d59383924c580a77a2346d9b1284c8589b3d43e2}}<br />
* [https://www.postgresql.org/message-id/PA4P191MB160009A09B9D0624359278CFBA9F9@PA4P191MB1600.EURP191.PROD.OUTLOOK.COM XX000 error caused by window function run conditions]<br />
** Owner: David Rowley<br />
** Fixed by: {{PgCommitURL|270eb4b5d4986534f2d522ebb19f67396d13cf44}}<br />
<br />
* [https://www.postgresql.org/message-id/20220701231413.GI13040@telsasoft.com large objects lost on upgrade]<br />
** Owner: Robert Haas (9a974cbcba005256a19991203583a94b4f9a21a9)<br />
** Fixed by: {{PgCommitURL|bbe08b8869bd29d587f24ef18eb45c7d4d14afca}}<br />
<br />
=== Resolved before 15beta2 ===<br />
<br />
* [https://www.postgresql.org/message-id/CA+HiwqGAGobiiHR8nH382HJxqm1mzZs8=3oKPXnXivWoFSZmNA@mail.gmail.com pgbench --partitions=0]<br />
** Owner: Michael Paquier (6f164e6d17616a157ea5d9e34dbb1b211c080c41)<br />
** Fixed by: {{PgCommitURL|27f1366050c6cd8c1ea5f03b367a5a167ebf34b7}}<br />
* [https://www.postgresql.org/message-id/3813350.1652111765%40sss.pgh.pa.us psql now shows zero elapsed time after an error]<br />
** Owner: Peter Eisentraut<br />
** Fixed by: {{PgCommitURL|9520f8d92a8681e441cc863422babd544353dd39}}<br />
* [https://www.postgresql.org/message-id/17495-7ffe2fa0b261b9fa@postgresql.org Regression in 15beta1 when filtering subquery including row_number window function]<br />
** Owner: David Rowley (9d9c02ccd1aea8e9131d8f4edb21bf1687e40782)<br />
** Fixed by: {{PgCommitURL|3e9abd2eb1b1f6863250f060290f514f30ce8044}}<br />
* [https://www.postgresql.org/message-id/20220524235250.gtt3uu5zktfkr4hv@alap3.anarazel.de Safety of subtrans ID caching]<br />
** Owner: Michael Paquier (06f5295af673df795e8e70e28c43d61c2817b6df)<br />
** Fixed by: {{PgCommitURL|b4529005fd387e863bfa9eb863629b1183c0449c}}<br />
* [https://www.postgresql.org/message-id/f80ace33-11fb-1cd3-20f8-98f51d151088@enterprisedb.com pg_upgrade test writes to source directory]<br />
** Owner: Michael Paquier (322becb6085cb92d3708635eea61b45776bf27b6)<br />
** Fixed by: {{PgCommitURL|15b6d2155375dee2fcba072fffa03c1c8b44656c}}<br />
* [https://www.postgresql.org/message-id/77e6ecaa-2785-97aa-f229-4b6e047cbd2b@enterprisedb.com pg_upgrade is not idempotent, even with --check]<br />
** Owner: Michael Paquier (38bfae36526636ef55daf7cf2a3282403587cb5b)<br />
** Fixed by: {{PgCommitURL|4fff78f00910af0137f9de7532f8eb21d08ab1c3}}<br />
* [https://www.postgresql.org/message-id/202204251548.mudq7jbqnh7r@alvherre.pgsql bogus: logical replication rows/cols combinations]<br />
** Owner: Amit Kapila<br />
** Fixed by: {{PgCommitURL|fd0b9dcebda7b931a41ce5c8e86d13f2efd0af2e}}<br />
* [https://www.postgresql.org/message-id/05ebcb44-f383-86e3-4f31-0a97a55634cf%40enterprisedb.com Ignoring BRIN for HOT udpates seems broken]<br />
** Owner: Tomas Vondra (5753d4ee320b)<br />
** Fixed by: {{PgCommitURL|e3fcca0d0d2414f3a50d6fd40eddf48b7df81475}}<br />
* [https://www.postgresql.org/message-id/PAXPR02MB760039506C87A2083AD85575E3DA9%40PAXPR02MB7600.eurprd02.prod.outlook.com psql no longer reports NOTICE messages promptly]<br />
** Owner: Peter Eisentraut (7844c9918)<br />
** Fixed by: {{PgCommitURL|e77de23fbb0f4ef27090c144edcfa889bb2a06d5}}<br />
* [https://www.postgresql.org/message-id/20220517.162719.1671558681467343711.horikyota.ntt@gmail.com amcheck is using a wrong macro to check compressed-ness]<br />
** Owner: Robert Haas (bd807be6935929bdefe74d1258ca08048f0aafa3)<br />
** Fixed by: {{PgCommitURL|e243de03fb4583dd4a9f0afb41493727d7946c02}}<br />
* [https://www.postgresql.org/message-id/20220607154744.vvmitnqhyxrne5ms%40jrouhaud COPY WITH (HEADER MATCH) broken with custom attribute list]<br />
** Owner: Peter Eisentraut (072132f04e55c1c3b0f1a582318da78de7334379)<br />
** Fixed by: {{PgCommitURL|ca7a0d1d368216e89359c63531a4df0b99a437e4}}<br />
* [https://www.postgresql.org/message-id/flat/DM4PR84MB17349C4E7D88A68264C18AF3EED69%40DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM PG15 beta1 fix pg_stats_ext/pg_stats_ext_exprs view manual]<br />
** Owner: Tomas Vondra<br />
** Fixed by: {{PgCommitURL|401f623c7b14890011b9bb9dda7639b1de4d40ad}}<br />
* [https://www.postgresql.org/message-id/20220625151930.GH22452@telsasoft.com Incorrect version check for datlocprovider in pg_upgrade]<br />
** Owner: Peter Eisentraut (f2553d43060edb210b36c63187d52a632448e1d2)<br />
** Fixed by: {{PgCommitURL|fa06a34d14ea053e1e405a6ab2a1c3f1631c3a5e}}<br />
* [https://www.postgresql.org/message-id/17522-bfcd5c603b5f4daa@postgresql.org Failure in TAP tests for IP address support in SANs with LibreSSL]<br />
** Owner: Peter Eisentraut (c1932e542863f0f646f005b3492452acc57c7e66)<br />
** Fixed by: {{PgCommitURL|901a9d53011573e45cd7b87682f0520ef3b0fd2d}}<br />
<br />
=== Resolved before 15beta1 ===<br />
<br />
* [https://www.postgresql.org/message-id/de57761c-b99b-3435-b0a6-474c72b1149a%40enterprisedb.com libpq: duplicate error message after connection loss]<br />
** Fixed by: {{PgCommitURL|93909599cdba64c8759d646983c0a4ef93de1e50}}<br />
<br />
* [https://www.postgresql.org/message-id/fab3b90a-914d-46a9-beb0-df011ee39ee5%40www.fastmail.com MERGE: ERROR: variable not found in subplan target lists]<br />
** Fixed by: {{PgCommitURL|ce4f46fdc814eb1b704d81640f6d8f03625d0f53}}<br />
<br />
* [https://www.postgresql.org/message-id/20220212211316.GK31460%40telsasoft.com Buildfarm warnings]<br />
** pg_basebackup.c:1261:35: warning: storing the address of local variable archive_filename in progress_filename [-Wdangling-pointer=]<br />
** new in 23a1c6578 - looks like a real error @Robert Haas<br />
** Fixed at: {{PgCommitURL|62cb7427d1e491faf8612a82c2e3711a8cd65422}}<br />
<br />
* [https://www.postgresql.org/message-id/20220311010223.GI28503@telsasoft.com pg_basebackup serverside compression broken with stdout and manifests]<br />
** Fixed at: {{PgCommitURL|b2de45f9200d9adcac50015521574696dc464ccd}}<br />
<br />
* pg_basebackup: bbstreamer_lz4.c:172: bbstreamer_lz4_compressor_content: Assertion `mystreamer->base.bbs_buffer.maxlen >= out_bound' failed. <br />
** [https://www.postgresql.org/message-id/20220316151253.GB28503@telsasoft.com basebackup LZ4 to stdout]<br />
** Owner: Robert Haas (dab298471ff2f91f33bc25bfb73e435d3ab02148)<br />
** Fixed at: {{PgCommitURL|afb529e6772b4e2b065644a2204697eeaf6c9a96}}<br />
<br />
* [https://www.postgresql.org/message-id/CAKFQuwamFuaQHKdhcMt4Gbw5+Hca2UE741B8gOOXoA=TtAd2Yw@mail.gmail.com Incorrect reset timestamp in stats after crash recovery]<br />
** Owner: Andres Freund (5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd)<br />
** Fixed at: {{PgCommitURL|5cd1c40b3ce9600f129fd1fea9850e1affaf31d5}}<br />
<br />
* [https://www.postgresql.org/message-id/YlPQGNAAa04raObK@paquier.xyz Fixes for compression options of pg_receivewal and refactoring of backup_compression.{c,h}]<br />
** Owner: Michael Paquier (babbbb595d2322da095a1e6703171b3f1f2815cb)<br />
** Fixed at: {{PgCommitURL|042a923ad53dfbe39a9d5012d6c3cf3c9c338884}}<br />
<br />
* [https://www.postgresql.org/message-id/CA+TgmoazKcKUWtqVa0xZqSzbKgTH+X-aw4V7GyLD68EpDLMh8A@mail.gmail.com Remove compatibility from pg_basebackup?]<br />
** Fixed at: {{PgCommitURL|9cd28c2e5f11dfeef64a14035b82e70acead65fd}}<br />
<br />
* [https://www.postgresql.org/message-id/4015413.1649454951%40sss.pgh.pa.us Timing-dependent failure in 002_archiving.pl]<br />
** Owner: Michael Paquier (46dea2419ee7895a4eb3d048317682e6f18a17e1)<br />
** Fixed at: {{PgCommitURL|e61efafcb82c605dcc78f668685223e20d2f7ad8}}, {{PgCommitURL|1a8b110539efe18803c1fa8aa452a2178dbad9a9}}<br />
<br />
* [https://www.postgresql.org/message-id/CA+hUKGJRbzaAOUtBUcjF5hLtaSHnJUqXmtiaLEoi53zeWSizeA@mail.gmail.com qsort performance regression]<br />
** Owner: John Naylor (6974924347c908335607a4a2f252213d58e21b7c)<br />
** Fixed at: {{PgCommitURL|99c754129d787ea4ce3b34b9f4c5f5e74c45ab6a}}<br />
<br />
* [https://www.postgresql.org/message-id/YlZyp26LVVfmwfgW@paquier.xyz Small issues with CLUSTER on partitioned tables]<br />
** Owner: Alvaro Herrera (cfdd03f45e6afc632fbe70519250ec19167d6765)<br />
** Fixed at: {{PgCommitURL|3f19e176ae0f55a653d62e1504dbe5ad8c1006a0}}, {{PgCommitURL|21a10368eb3fce73f146d7e48b4d81496f60d965}}<br />
<br />
* [https://www.postgresql.org/message-id/20220408124338.GK24419@telsasoft.com asynchronous execution crash in trivial_subqueryscan()]<br />
** Owner: Etsuro Fujita (c2bb02bc2e858ba345b8b33f1f3a54628f719d93)<br />
** Fixed at: {{PgCommitURL|5c854e7a2c8a6cd26040e0f9949e7a4a007f6366}}<br />
<br />
* [https://www.postgresql.org/message-id/flat/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de Corruption due to relfilenode reuse]<br />
** pg_upgrade can corrupt data with the new OIDs preservation feature<br />
*** Fixed at: {{PgCommitURL|e2f65f42555ff531c6d7c8f151526b4ef7c016f8}}<br />
** the ProcSignalBarrier solution this builds on also turns out to have a small race/hole<br />
*** Fixed at: {{PgCommitURL|b74e94dc27fdbb13954f230b1d1298430afa6c0c}}<br />
** Owner: Thomas Munro, Robert Haas<br />
<br />
* [https://www.postgresql.org/message-id/20220502042718.GB1565149@rfd.leadboat.com Some issues with the TAP tests of pg_upgrade]<br />
** Owner: Michael Paquier<br />
** Fixed at: {{PgCommitURL|7dd3ee508432730d15c5d3032f37362f6b6e4dd8}}<br />
<br />
* [https://www.postgresql.org/message-id/CAMbWs4-LN%3DbF8f9eU2R94dJtF54DfDvBq%2BovqHnOQqbinYDrUw%40mail.gmail.com Crash in _outPathTarget]<br />
** Owner: Peter Eisentraut<br />
** Fixed at: {{PgCommitURL|9ddf251f94090cebf1bd8fc18396cb8a4b580d04}}<br />
<br />
* [https://www.postgresql.org/message-id/flat/Ymd/e5eeZMNAkrXo%40paquier.xyz#23885a148c6899cc874a7bf68f228777 Instability of regression test of pg_walinspect]<br />
** Owner: Jeff Davis<br />
** Fixed at: {{PgCommitURL|ed57cac84d1c5642737dab1e4c4b8cb4f0c4305f}}<br />
<br />
* [https://www.postgresql.org/message-id/YkfeMNYRCGhySKyg%40ahch-to crash with JSON constructors and window functions]<br />
** Owner: Andrew Dunstan (f4fb45d15c59d7add2e1b81a9d477d0119a9691a)<br />
** Fixed at: {{PgCommitURL|4eb9798879680dcc0e3ebb301cf6f925dfa69422}}, {{PgCommitURL|112fdb3528465cc14a2f1dff3dc27f100326d885}}<br />
<br />
* [https://www.postgresql.org/message-id/CAA4eK1LpBFU49Ohbnk%3Ddv_v9YP%2BKqh1%2BSf8i%2B%2B_s-QhD1Gy4Qw%40mail.gmail.com 013_partition.pl failing]<br />
** Fixed at: {{PgCommitURL|dd4ab6fd6528e160571986fa8817cee9f2645aa8}}<br />
<br />
* [https://www.postgresql.org/message-id/Yni6ZHkGotUU+RSf@paquier.xyz Avoid garbage logs with postgres -C on runtime-computed GUCs]<br />
** Fixed at: {{PgCommitURL|8bbf8461a3a2a38ce5f2952a025385b6938a61f7}}<br />
** Owner: Michael Paquier<br />
<br />
* [https://www.postgresql.org/message-id/20220506234924.6mxxotl3xl63db3l@alap3.anarazel.de Some issues with mark_pgdllimport.pl]<br />
** Fixed at: {{PgCommitURL|5edeb574285ecbcc47f0b769a7e363404db0155b}}<br />
** Owner: Robert Haas<br />
<br />
* [https://www.postgresql.org/message-id/1656446.1650043715%40sss.pgh.pa.us Crash in new pgstats code]<br />
** Initially reported issue was fixed by {{PgCommitURL|4a736a161c306fcfed970e6b649f2f03f465ac24}}, but there may be more to do here.<br />
** Owner: Andres Freund<br />
<br />
* [https://www.postgresql.org/message-id/b3463b8c-2328-dcac-0136-af95715493c1%40xs4all.nl TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508]<br />
** Fixed at: {{PgCommitURL|0cf16cb8ca4853b084c40eca310c4c9c3ebf7e2a}}<br />
** Owner: Andres Freund<br />
<br />
* [https://www.postgresql.org/message-id/YlGJGiofZiWN3elx@jrouhaud limitations of GetMaxBackends()]<br />
** Fixed at: {{PgCommitURL|4f2400cb3f10aa79f99fba680c198237da28dd38}}, {{PgCommitURL|ab02d702ef08343fba30d90fdf7df5950063e8c9}}, {{PgCommitURL|7fc0e7de9fb8306e84d1c15211aba4308f694455}}<br />
** Owner: Robert Haas (aa64f23b02924724eafbd9eadbf26d85df30a12b and 4567596316d186c6e61c72df013797266fcac2f7)<br />
<br />
<br />
<br />
== Won't Fix ==<br />
<br />
* InvokeNamespaceSearchHook calls need to be moved<br />
** [https://www.postgresql.org/message-id/2600348.1647987525%40sss.pgh.pa.us Re: New Object Access Type hooks]<br />
** Problem shown by 90efa2f5565d28054c30c18f6a2f17f94fdff91e.<br />
* [https://www.postgresql.org/message-id/20220603195318.qk4voicqfdhlsnoh@alap3.anarazel.de Reduce amount of logs generated by TAP tests of pg_upgrade?]<br />
** Owner: Michael Paquier<br />
** Other thread: [https://www.postgresql.org/message-id/YrP6ZRXITYWhpVrl@paquier.xyz here]<br />
** The problem is wider than just the upgrade tests, as all the runs of pg_regress would be impacted. We may want a more centralized solution for this older problem.<br />
<br />
== Important Dates ==<br />
<br />
Current schedule:<br />
<br />
* Feature Freeze: April 7, 2022 ('''Last Day to Commit Features''')<br />
* Beta 1: May 19, 2022<br />
* Beta 2: June 30, 2022<br />
* Beta 3: (August 11, 2022)<br />
* GA: TBD<br />
<br />
== See also ==<br />
<br />
* [[Release Management Team]]<br />
<br />
[[Category:Open_Items]]</div>
Alvherre https://wiki.postgresql.org/index.php?title=User:Alvherre&diff=37118 User:Alvherre 2022-07-20T14:29:49Z <p>Alvherre: update email address</p>
<hr />
<div><br />
Álvaro Herrera: alvaro.herrera at [https://enterprisedb.com enterprisedb.com]</div>Alvherre