Pgconfdev2026 Feedback-based Query Optimization

Planner Feedback

How can we teach the planner to learn from past mistakes?

Session Details

  • pgconf.dev 2026
  • Friday May 22, 2026 10:30AM PDT to 11:20AM PDT
  • Unconference session
  • Proposed and conducted by Alexandra Wang

Paraphrased Transcript

Thoughts/Preface from Robert Haas

Based on experience from pg_plan_advice.

He thinks that plan advice should include cardinality hints.

Example: a join turns out to be much bigger than estimated. Can we capture that feedback to use as a hint on the next use of that query?
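
As an illustration only, the feedback loop in this example could be sketched as below. This is not how pg_plan_advice works; `FeedbackStore`, `record`, and `adjusted_estimate` are hypothetical names, and keying the feedback on the set of joined relation names is an assumption made to keep the sketch small.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Hypothetical store of observed join cardinalities (illustration only)."""

    # Maps a frozenset of relation names to the last (estimated, actual) row counts.
    observations: dict = field(default_factory=dict)

    def record(self, relations, estimated_rows, actual_rows):
        """Remember what the planner guessed and what execution really produced."""
        self.observations[frozenset(relations)] = (estimated_rows, actual_rows)

    def adjusted_estimate(self, relations, estimated_rows):
        """Scale a fresh estimate by the previously observed actual/estimated ratio."""
        seen = self.observations.get(frozenset(relations))
        if seen is None:
            return estimated_rows  # no feedback yet: keep the planner's estimate
        old_estimate, actual = seen
        ratio = actual / max(old_estimate, 1)  # guard against a zero-row estimate
        return estimated_rows * ratio


store = FeedbackStore()
# First execution: the join was estimated at 100 rows but produced 50,000.
store.record({"orders", "customers"}, estimated_rows=100, actual_rows=50_000)
# Next planning of the same join: the stored ratio acts as a cardinality hint.
hint = store.adjusted_estimate({"customers", "orders"}, estimated_rows=100)
# A join that was never observed is left untouched.
unknown = store.adjusted_estimate({"orders", "items"}, estimated_rows=200)
```

A real implementation would also have to decide how to identify "the same join" across queries and when stored feedback goes stale; the sketch ignores both.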

Tomas Vondra

Continues with the underestimate example.

He feels that it's basically impossible for the planner to overcome bad row estimates.

Alexandra Wang

Relates experience from current work on Join Statistics, which got her thinking about what stats could be collected during execution, and what information would make a difference.

She laments that the postmortem collection of stats for future executions is currently very manual.

Bruno viera de Silva

Wonders about the current limit on the number of stats buckets that can be retained, and whether it could be dynamically adjusted.

Mark Dilger

Real numbers are a degenerate case of a distribution, and we actually want the distribution, not the numbers.

Talks about the relative risk of plans, and whether we can set the planner to tolerate not-quite-optimal plans if it means less risk of a terrible plan.
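
A minimal sketch of that trade-off, under stated assumptions: the two cost formulas, the plan names, and the sample row counts are all invented for illustration and do not come from the PostgreSQL cost model. It compares picking a plan from a single point estimate with picking the plan that is cheapest at a pessimistic quantile of the row-count distribution.

```python
import statistics


def plan_costs(rows):
    """Hypothetical cost formulas for two plans, as a function of input rows."""
    return {
        "nested_loop": 1.0 * rows,         # cheap for few rows, grows quickly
        "hash_join": 500.0 + 0.05 * rows,  # fixed startup cost, grows slowly
    }


def pick_by_point_estimate(row_samples):
    """Collapse the distribution to one number (the median), then pick the cheapest plan."""
    costs = plan_costs(statistics.median(row_samples))
    return min(costs, key=costs.get)


def pick_by_tail_risk(row_samples, quantile=0.95):
    """Pick the plan that is cheapest at a pessimistic quantile of the row count.

    Both cost formulas increase with rows, so the cost at the row-count
    quantile equals the same quantile of the cost.
    """
    ordered = sorted(row_samples)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    costs = plan_costs(ordered[index])
    return min(costs, key=costs.get)


# Plausible row counts for one join: usually small, occasionally huge.
samples = [10, 20, 50, 100, 100, 200, 100_000]
optimistic_choice = pick_by_point_estimate(samples)
risk_averse_choice = pick_by_tail_risk(samples)
```

With these made-up numbers the point estimate favors the nested loop, while the tail-risk rule accepts a somewhat more expensive plan in the common case to avoid the terrible one.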

Vladimir Churyukin

Previous queries are a poor indicator of future queries, so collecting stats on the fly can be misleading. Just make the existing stats better.

Tomas Vondra

A model will have a hard time if it can't recognize correlations in the data.

Stressed that there are so many variables to this problem that it's difficult to discuss it holistically. We should narrow our efforts to a very small area (a particular node type) because that makes progress possible.

Alexandra Wang

Recaps how Join Statistics is an example of what Tomas was just talking about.

Tomas Vondra

Generally agrees, and tries to reframe the point so that other efforts could have the same limited scope.

Mark Dilger

Preach.

Stu Hood

Likes the join statistics efforts. Is interested in automatically collecting stats on any joins that happen in actual queries, without necessarily waiting for our interest in them to be declared.

Jacob Park

Suggested another group that we could reach out to, to see whether their work overlaps with this.

Robert Haas

Thinks Join Stats are promising, and likes stats-based solutions over hint-based learned solutions. He spoke to the pg_plans_advsr developers, who did some convergence stats collection, but didn't share details.

Discussion then moved to AQO and why it keeps turning up in conversations about this.

Alena Rybakina

Explains the purpose of AQO and talks about additional follow-on work after that. That work resorted to partial execution of the query, then fed the results back to the optimizer to re-plan the query. After a few iterations an acceptable plan was reached, but this concerned queries that previously wouldn't finish; the goal was not to find the optimal plan for the query. She then described the limitations of that methodology.
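
The iterate-and-replan idea described here can be pictured with a toy loop. This is a sketch under stated assumptions, not AQO's implementation: `choose_plan`, `partial_execute`, the 1,000-row threshold, and the "give up at 10x the estimate" budget are all hypothetical.

```python
def choose_plan(estimated_rows):
    """Hypothetical planner: nested loop for small inputs, hash join otherwise."""
    return "nested_loop" if estimated_rows < 1_000 else "hash_join"


def partial_execute(true_rows, row_budget):
    """Run until the row budget is used up; report rows seen and whether we finished."""
    rows_seen = min(true_rows, row_budget)
    return rows_seen, rows_seen == true_rows


def replan_with_feedback(initial_estimate, true_rows, max_iterations=10):
    """Re-plan repeatedly, feeding each partial execution's row count back in."""
    estimate = initial_estimate
    plan = None
    history = []
    for _ in range(max_iterations):
        plan = choose_plan(estimate)
        history.append((plan, estimate))
        # Assumption: abandon execution once it exceeds 10x the estimated rows.
        rows_seen, finished = partial_execute(true_rows, row_budget=estimate * 10)
        if finished:
            # The plan completed within budget: acceptable, not necessarily optimal.
            return plan, history
        # The estimate was too low; the rows seen so far are a new lower bound.
        estimate = rows_seen
    return plan, history


# The planner believes the join yields 5 rows; it actually yields 40,000.
final_plan, attempts = replan_with_feedback(initial_estimate=5, true_rows=40_000)
```

In this toy run the estimate climbs from 5 to 50, 500, and then 5,000 rows, at which point the planner switches to the hash join and the query completes on the fourth attempt.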

Haibo Yan

Talked about identifying the most selective join in a series of joins.

Alexandra Wang

Asked questions about the partial execution solution.

Alena Rybakina

Went into more detail about a query underestimation scenario.

Alexandra Wang

Polled the Committers in the room about whether partial execution was a viable core solution.

Robert Haas

Partial execution feels more like an extension to him. Went into details about the process of maintaining such a thing.

Alexandra Wang

Refined her question: do we already have the infrastructure to support a partial execution extension?

Robert Haas

Enumerated several likely components that he knows aren't there now, but didn't rule out adding them.

Thomas Munro

Had a question about how partial execution works in a production environment. Consensus was that it is more about training than ad-hoc use.

Jeff Davis

Asked about the existence of academic papers on this topic.

Alexandra Wang

Yes. Listed the high-level concepts of a few papers. Noted a lack of papers on cost model adjustments.

Jeff Davis

Asked Robert what his thoughts were based on, research-wise.

Robert Haas

Not a lot of papers; would definitely like to see more of them. His experience was more based on the needs that led to pg_plan_advice and its own direct concern.

Alena Rybakina

Talks about SkinnerDB and its method for refining query plans.

_(Robert signals 5 minutes remaining in session)_

Tomas Vondra

Observes that many research papers out there use very old versions of Postgres or forks of Postgres, and thus aren't as useful as they could be. He suggests that doing our own proofs of concept (i.e. our own research) might be the better path going forward.

Alexandra Wang

Asked for clarification about the existing planner hooks and what might be missing.

Robert Haas

Does not think there's enough there for tweaking cardinality for a baserel. Talks about areas where he's uncertain of what is visible/tallied, but can see that there would be a need for that.

Haibo Yan

Suggested splitting this into two efforts: observability, and actually making use of what was observed.

Session Concluded