Statistics the planner doesn't have that it really needs

From PostgreSQL wiki

The planner/optimizer is a voracious consumer of information: no matter how much we feed it, it always wants more and better data. Some things we really need in order to make better decisions:

  • A "clusteredness" metric to replace the use of "correlation" for estimating the amount of random access and the cache hit rate for index scans

on multiple columns. Currently we assume they're independent, which can lead to overly optimistic estimates.

probably requires scanning much larger samples to get good data. A good algorithm was [? published].
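
To make the independence assumption for multiple columns mentioned above concrete, here is a minimal Python sketch. It is not PostgreSQL code: the synthetic table, the column names a and b, and both helper functions are hypothetical, and the real planner works from sampled statistics rather than from the full table. It only shows how multiplying per-column selectivities underestimates the row count when one column fully determines the other.

```python
# Hypothetical illustration only; this does not reproduce PostgreSQL's estimator.

def independent_estimate(rows, preds):
    """Estimate matching rows by multiplying per-column selectivities,
    which is what assuming independent columns amounts to."""
    n = len(rows)
    selectivity = 1.0
    for col, value in preds.items():
        matching = sum(1 for r in rows if r[col] == value)
        selectivity *= matching / n
    return selectivity * n


def actual_count(rows, preds):
    """Count the rows that really satisfy every equality predicate."""
    return sum(1 for r in rows if all(r[c] == v for c, v in preds.items()))


# Synthetic table (an assumption): 10,000 rows where b always equals a,
# so the two columns are completely dependent.
rows = [{"a": i % 100, "b": i % 100} for i in range(10_000)]
preds = {"a": 7, "b": 7}

estimated = independent_estimate(rows, preds)
actual = actual_count(rows, preds)
print(f"estimated rows: {estimated:.2f}, actual rows: {actual}")
```

In this synthetic case the product of selectivities predicts about 1 row while 100 rows actually match, so a plan chosen for a tiny result would receive a hundred times more rows than expected.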