Index-only scans

From PostgreSQL wiki

(Difference between revisions)
Jump to: navigation, search
(+cat)
Line 1: Line 1:
Index-only scans, also called covering indexes, is a performance feature which allows an index to be used to satisfy a query without accessing the heap tuples. This is possible using the visibility map in cases where all tuples on a page are visible to all sessions.
+
Index-only scans are a major performance feature added to Postgres 9.2. They allow certain types of queries to be satisfied just by retrieving data from indexes, and not from tables. This can result in a significant reduction in the amount of I/O necessary to satisfy queries.
  
TODO items include:
+
During a regular index scan, indexes are traversed, like any other tree structure, by comparing a constant against Datums that are stored in the index. Btree-indexed types must satisfy the trichotomy property; that is, the type must follow the reflexive, symmetric and transitive law. Those laws accord with our intuitive understanding of how a type ought to behave anyway, but the fact that an index's physical structure reflects the relative values of Datums actually mandates that these rules be followed by types. Indexes contain what are technically redundant copies of the column data that is indexed.
  
* test speed improvement for single-row lookups
+
Indexes do not contain visibility information. That is, it is not possible to ascertain if any given tuple is visible to the current transaction, which is why it has taken so long for index-only scans to be implemented. Writing an implementation with a cheap but reliable visibility look-aside proved challenging.
* test speed improvement for scans of the entire index (this involves random I/O)
+
** we can't scan the index in physical order like vacuum does
+
* make the visibility map always accurate and crash-safe
+
** see [http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/access/heap/visibilitymap.c;h=ab941debcbe41fab34031b9d84652a1b9c65a950;hb=HEAD#l64 visibilitymap.c], lines 64-86
+
* modify the index API to allow index-only lookups
+
* modify the optimizer to identify when index-only scans are useful
+
** right now large index scans are rarely chosen
+
* modify the executor to process index-only scans
+
  
=== Making the Visibility Map Crash-Safe ===
+
The implementation of the feature disproportionately involved making an existing on-disk structure called the visibility map crash-safe. It was necessary for the structure to reliably (and inexpensively) indicate visibility of index tuples - to do any less would imply the possibility of index-only scans producing incorrect results, which of course would be absolutely unacceptable.
  
Currently, a heap page that has all-visible tuples is marked by vacuum as PD_ALL_VISIBLE and the visibility map (VM) bit is set.  This is currently unlogged, and a crash could require these to be set again.
+
The fact that indexes only contain data that is actually indexed, and not other unindexed columns, naturally precludes using an index-only scan when the other columns are queried (by appearing in a query select list, for example).
  
The complexity is that for index-only scans, the VM bit has meaning, and cannot be incorrectly set (though it can be incorrectly cleared because that would just result in additional heap access).  If both PD_ALL_VISIBLE and the VM bit were to be set, and a crash resulted the VM bit being written to disk, but not the PD_ALL_VISIBLE bit, a later heap access that wrote a conditionally-visible row would not know to clear the VM bit, causing incorrect results for index-only scans. 
+
=== Example queries where index-only scans could be used in principle ===
  
The solution is to WAL log the VM set bit activity.  This will cause full-page writes for the VM page, but this is much less than WAL-logging each heap page because a VM page represents many heap pages.  This requires that the VM page not be written to disk until its VM-set WAL record is fsynced to disk.  Also, during crash recovering, reading the VM-set WAL record would cause both the VM-set and heap PD_ALL_VISIBLE to be set.
+
Assuming that there is some (non-expression) index on a column (typically a primary key)::
  
It is still possible for PD_ALL_VISIBLE to be set and for the VM bit not to be set if the heap page was written to disk before the VM-set WAL record was fsync'ed to disk.  In such cases, vacuum will clean this up, and it is also possible for index-only lookups to correct this as well.
+
  select count(*) from categories;
  
=== Email Threads ===
+
Assuming that there is a composite index on (1st_indexed_col, 2nd_indexed_col)::
  
* [http://archives.postgresql.org/pgsql-patches/2007-10/msg00166.php <nowiki>Re: [HACKERS] Including Snapshot Info with Indexes</nowiki>]
+
  select 1st_indexed_col, 2nd_indexed_col from categories;
* [http://archives.postgresql.org/pgsql-patches/2008-01/msg00049.php <nowiki>Re: [HACKERS] Including Snapshot Info with Indexes</nowiki>]
+
* [http://archives.postgresql.org/pgsql-hackers/2008-06/msg01094.php <nowiki>TODO item: Allow data to be pulled directly from indexes</nowiki>]
+
* [http://archives.postgresql.org/pgsql-hackers/2008-09/msg00003.php <nowiki>Re: [PATCHES] VACUUM Improvements - WIP Patch</nowiki>]
+
* [http://archives.postgresql.org/pgsql-hackers/2010-11/msg00493.php <nowiki>We need index-only scans</nowiki>]
+
* [http://rhaas.blogspot.com/2010/11/index-only-scans.html blog entry]
+
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg02011.php
+
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00013.php
+
* http://archives.postgresql.org/pgsql-hackers/2010-12/msg00294.php
+
* http://archives.postgresql.org/pgsql-hackers/2011-01/msg00375.php
+
* http://archives.postgresql.org/pgsql-hackers/2011-03/msg01323.php
+
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00292.php
+
* http://archives.postgresql.org/pgsql-hackers/2011-05/msg00481.php
+
* http://archives.postgresql.org/pgsql-hackers/2011-06/msg01632.php
+
* [http://archives.postgresql.org/pgsql-hackers/2011-08/msg00545.php Index-only scans patch]
+
  
=== Microfunding ===
+
Postgres 9.2 added the capability of allowing indexed_col op ANY(ARRAY[...]) conditions to be used in plain index scans and index-only scans. Previously, such conditions could only be used in bitmap index scans. For this reason, it is possible to see an index-only scan for these ScalarArrayOpExpr queries::
  
The intention of this section is to add information that can help funding development of the feature. First, a list of people willing to fund part of the cost. Second, developers willing to take the task and hopefully the groups will meet somehow. No-one commits to anything by writing them on the list, except from being contacted.
+
  select indexed_col from categories where indexed_col in (4, 5, 6);
  
==== Funders ====
+
=== Index-only scans and index-access methods ===
  
* Jesper Krogh <jk@novozymes.com>, Novozymes A/S,  2.000+€
+
Index-only scans are not actually limited to scans on btree indexes. SP-GiST operator classes may optionally support index-only scans (naturally, this is only sensible when the index isn't lossy)::
* Jim Nasby <jnasby@enovafinancial.com> / Enova Financial, $2,000+ USD
+
  
==== Developers ====
+
  postgres=# select amname, amcanreturn from pg_am where amcanreturn != 0;
 +
  amname | amcanreturn
 +
  --------+--------------
 +
  btree  | btcanreturn
 +
  spgist | spgcanreturn
 +
  (2 rows)
  
*
+
Support for additional index AMs may follow in a future release of PostgreSQL.
 +
 
 +
=== The Visibility Map (and other relation forks) ===
 +
 
 +
The Visibility Map is a simple data structure associated with every heap relation (table). It is a "relation fork"; an on-disk ancillary file associated with a particular relation (table or index). Note that index relations (that is, indexes) do not have a visibility map associated with them. The visibility map is concerned with tracking which tuples are visible to all transactions at a high level. Tuples from one transaction may or may not be visible to any given other transaction, depending on whether or not their originating transaction actually committed (yet, or ever, if the transaction aborted), and when that occurred relative to our transaction's current snapshot. Note that the exact
 +
behaviour depends on our transaction isolation level. Note also that it is quite possible for one transaction to see one physical tuple/set of values for one logical tuple, while another transaction sees other, distinct values for that same logical tuple, because, in effect, each of the two transaction has a differing idea of what constitutes "now". This is the core idea of MVCC. When there is absolute consensus that all physical tuples in a heap page are visible, the page's corresponding bit may be set.
 +
 
 +
Another relation fork that you may be familiar with is the freespace map. In contrast to the visibility map, there is a FSM for both heap and index relations (with the sole exception of hash index relations, which have none).
 +
 
 +
The purpose of the freespace map is to quickly locate a page with enough free space to hold a tuple to be stored, or to determine if no such page exists and the relation has to be extended.
 +
 
 +
In PostgreSQL 8.4, the current freespace map implementation was added. It made the freespace map an on-disk relation fork. The previous implementation required administrators to guestimate the number of relations, and the required freespace map size for each, so that the freespace map existed only in a fixed allocation of shared memory. This tended to result in wasted space due to undersizing, as the core system's storage manager needlessly extended relations::
 +
 
 +
  [peter@peterlaptop 12935]$ ls -l -h -a
 +
  -rw-------. 1 peter peter 8.0K Sep 28 00:00 12910
 +
  -rw-------. 1 peter peter  24K Sep 28 00:00 12910_fsm
 +
  -rw-------. 1 peter peter 8.0K Sep 28 00:00 12910_vm
 +
  ***SNIP***
 +
 
 +
The FSM is structured as a binary tree <ref>[http://www.postgresql.org/docs/9.2/static/storage-fsm.html]</ref>. There is one leaf node per heap page, with non-leaf nodes stores the maximum amount of free space for any of its children. So, unlike EXPLAIN output's node costs, the values are not cumulative.
 +
 
 +
The visibility map is a simpler structure. There is one bit for each page in the heap relation that the visibility map corresponds to.
 +
 
 +
The primary practical reason for having and maintaining the visibility map is to optimise VACUUM. A set bit indicates that all tuples on the corresponding heap page are known to be visible to all transactions, and therefore that vacuuming the page is unnecessary. Like the new freespace map implementation, the visibility map was added in Postgres 8.4.
 +
 
 +
The visibility map is conservative in that a set bit (1) indicates that all tuples are visible on the page, but an unset bit (0) indicates that that condition may or may not be true <ref>[http://www.postgresql.org/docs/9.2/static/storage-vm.html]</ref>.
 +
 
 +
Crash safety, recovery and the visibility map
 +
---------------------------------------------
 +
 
 +
This involves WAL-logging setting a bit within the visibility map during VACUUM, and taking various special measures during recovery.
 +
 
 +
The Postgres write-ahead log is widely used to ensure crash-safety, but it is also intergral to the built-in Hot Standby/Streaming replication feature.
 +
 
 +
Recovery treats marking a page all-visible as a recovery conflict for snapshots that could still fail to see XIDs on that page.  PostgreSQL may in the future try to soften this, so that the implementation simply forces index scans to do heap fetches in cases where this may be an issue, rather than throwing a hard conflict.
 +
 
 +
=== Covering indexes ===
 +
 
 +
Covering indexes are indexes creating for the express purpose of being used in index-only scans. They typically "cover" more columns than would otherwise make sense for an index, typically columns that are known to be part of particular expensive, frequently executed query's selectlist. PostgreSQL supports using just the first few columns of the index in a regular index scan if that is in the query's qual, so covering indexes need not be completely useless for regular index scans.
 +
 
 +
=== Interaction with HOT ===
 +
 
 +
HOT (Heap-only tuples) is a major performance feature that was added in Postgres 8.3. This allowed UPDATES to rows (which, owing to Postgres's MVCC architecture, are implemented with a deletion and insertion of physical tuples) to only have to create a new physical heap tuple when inserting, and not a new index tuple, if and only if the update did not affect indexed columns.
 +
 
 +
With HOT, it became possible for an index scan to traverse a so-called HOT chain; it could get from the physical index tuple (which would probably have been created by an original INSERT, and related to an earlier version of the logical tuple), to the corresponding physical heap tuple. The heap tuple would itself contain a pointer to the next version of the tuple (that is, the tuple ctid), which might, in turn, have a pointer of its own. The index scan eventually arrives at tuple that is current according to the query's snapshot.
 +
 
 +
HOT also enables opportunistic mini-vacuums, where the HOT chain is "pruned".
 +
 
 +
All told, this performance optimisation has been found to be very valuable, particularly for OLTP workloads. It is quite natural that tuples that are frequently updated are generally not indexed. However, when considering creating a covering index, the need to maximise the number of HOT updates should be carefully weighed.
 +
 
 +
You can monitor the total proportion of HOT updates for each relation using this query::
 +
 
 +
  postgres=# select n_tup_upd, n_tup_hot_upd from pg_stat_user_tables;
 +
 
 +
=== What types of queries may be satisfied by an index-only scan? ===
 +
 
 +
Aside from the obvious restriction that queries cannot reference columns that are not indexed by a single index in order to use an index-only scan, the need to visit the heap where all tuples are not known to be visible is relatively expensive. The planner weighs this factor heavily when considering an index-only scan, and in general the need to ensure that the bulk of the table's tuples have their visibility map bits set is likely to restrict index-only scans' usefulness to queries against infrequently updated tables.
 +
 
 +
It is not necessary for all bits to be set; index-only scans may "visit the heap" if that is necessary. Index-only scans are something of a misnomer, in fact - index mostly scans might be a more appropriate appellation. An explain analyze involving an index-only scan will indicate how frequently that occurred in practice::
 +
 
 +
  postgres=# explain analyze select count(*) from categories;
 +
                                                                  QUERY PLAN
 +
  ------------------------------------------------------------------------------------------------------------------------------------------
 +
  Aggregate  (cost=12.53..12.54 rows=1 width=0) (actual time=0.046..0.046 rows=1 loops=1)
 +
    ->  Index Only Scan using categories_pkey on categories  (cost=0.00..12.49 rows=16 width=0) (actual time=0.018..0.038 rows=16 loops=1)
 +
          Heap Fetches: 16
 +
  Total runtime: 0.108 ms
 +
  (4 rows)
 +
 
 +
As the number of heap "visits" goes up, the planner will eventually conclude that an index-only scan isn't desirable, as it isn't the cheapest possible plan according to its cost model. The value of index-only scans lies wholly in their potential to allow us to elide heap access (if only partially) and minimise I/O.
 +
 
 +
=== Is "count(*)" much faster now? ===
 +
 
 +
A traditional complaint made of PostgreSQL, generally when comparing it unfavourably with MySQL (at least when using the MyIsam storage engine, which doesn't use MVCC) has been "count(*) is slow". Index-only scans *can* be used to satisfy these queries without there being any predicate to limit the number of rows returned, and without forcing an index to be used by specifying that the tuples should be ordered by an indexed column. However, in practice that isn't particularly likely.
 +
 
 +
It is important to realise that the planner is concerned with minimising the total cost of the query. With databases, the cost of I/O typically dominates. For that reason, "count(*) without any qual" queries will only use an index-only scan if the index is significantly smaller than its table. This typically only happens when the table's row width is much wider than some indexes'.
 +
 
 +
=== Why isn't my query using an index-only scan? ===
 +
 
 +
VACUUM does not have any particular tendency to behave more aggressively to facilitate using index-only scans more frequently. While VACUUM can be set to behave more aggressively in various ways, it's far from clear that to do so specifically to make index-only scans occur more frequently represents a sensible course of action.
 +
 
 +
The planner doesn't directly examine the entire visibility map of a relation when considering an index-only scan (however, the executor does maintain a running tally, which is visible in explain analyze output). However, the planner does naturally weigh the proportion of pages which are known visible to all.
 +
 
 +
In Postgres 9.2, statistics are gathered about the proportion of pages that are known all-visible. The pg_class.relallvisible column indicates how many pages are visible (the proportion can be obtained by calculating it as a proportion of pg_class.relpages). These statistics are updated during VACUUM and ANALYZE.
 +
 
 +
Note that it is possible to examine the number of index scans (including index-only scans and bitmap index scans) by examining
 +
pg_stat_user_indexes.idx_scan. If your covering index isn't being used, you're essentially paying for the overhead of maintaining it during writes with no benefit in return. Drop the index!
 +
 
 +
=== Summary ===
 +
 
 +
It is possible for index-only scans to greatly decrease the amount of I/O required to execute some queries. For certain queries, particularly queries that are characteristic of data warehousing (i.e. relatively large amounts of static, infrequently-updated data where reports on historic data is frequently required), they can considerably improve performance. Such queries have been observed to execute anything from twice to twenty times as fast with index-only scans. However, one should bear in mind that:
 +
 
 +
* Index-only scans are opportunistic, in that they take advantage of a pre-existing state of affairs where it happens to be possible to elide heap access. However, the server doesn't make any particular effort to facilitate index-only scans, and it is difficult to recommend a course of action to make index-only scans occur more frequently, except to define covering indexes in response to a measured need (For example, when pg_stat_statements indicates that a disproportionate amount of I/O is being used to execute a query against fairly static data, with a smallish subset of table columns retrieved).
 +
 
 +
* When creating a covering index, the likely effect on HOT updates should be weighed heavily. Are there many HOT updates on the table to begin with? This is a general point of concern, because creating an index may prevent HOT updates from occurring, and because the number of HOT updates is a reasonably good proxy for just how static a table is, and therefore how likely it is that most heap pages are known to be all-visible.
 +
 
 +
* Index-only scans are only used when the planner surmises that that will reduce the total amount of I/O required, according to its imperfect cost-based modelling. This all heavily depends on visibility of tuples, if an index would be used anyway (i.e. how selective a qual is, etc), and if there is actually an index available that could be used by an index-only scan in principle.
  
 
[[Category:Indexes]]
 
[[Category:Indexes]]

Revision as of 22:15, 15 November 2012

Index-only scans are a major performance feature added to Postgres 9.2. They allow certain types of queries to be satisfied just by retrieving data from indexes, and not from tables. This can result in a significant reduction in the amount of I/O necessary to satisfy queries.

During a regular index scan, indexes are traversed, like any other tree structure, by comparing a constant against Datums that are stored in the index. Btree-indexed types must satisfy the trichotomy property; that is, the type must follow the reflexive, symmetric and transitive law. Those laws accord with our intuitive understanding of how a type ought to behave anyway, but the fact that an index's physical structure reflects the relative values of Datums actually mandates that these rules be followed by types. Indexes contain what are technically redundant copies of the column data that is indexed.

Indexes do not contain visibility information. That is, it is not possible to ascertain if any given tuple is visible to the current transaction, which is why it has taken so long for index-only scans to be implemented. Writing an implementation with a cheap but reliable visibility look-aside proved challenging.

The implementation of the feature disproportionately involved making an existing on-disk structure called the visibility map crash-safe. It was necessary for the structure to reliably (and inexpensively) indicate visibility of index tuples - to do any less would imply the possibility of index-only scans producing incorrect results, which of course would be absolutely unacceptable.

The fact that indexes only contain data that is actually indexed, and not other unindexed columns, naturally precludes using an index-only scan when the other columns are queried (by appearing in a query select list, for example).

Contents

Example queries where index-only scans could be used in principle

Assuming that there is some (non-expression) index on a column (typically a primary key)::

 select count(*) from categories;

Assuming that there is a composite index on (1st_indexed_col, 2nd_indexed_col)::

 select 1st_indexed_col, 2nd_indexed_col from categories;

Postgres 9.2 added the capability of allowing indexed_col op ANY(ARRAY[...]) conditions to be used in plain index scans and index-only scans. Previously, such conditions could only be used in bitmap index scans. For this reason, it is possible to see an index-only scan for these ScalarArrayOpExpr queries::

 select indexed_col from categories where indexed_col in (4, 5, 6);

Index-only scans and index-access methods

Index-only scans are not actually limited to scans on btree indexes. SP-GiST operator classes may optionally support index-only scans (naturally, this is only sensible when the index isn't lossy)::

 postgres=# select amname, amcanreturn from pg_am where amcanreturn != 0;
  amname | amcanreturn
 --------+--------------
  btree  | btcanreturn
  spgist | spgcanreturn
 (2 rows)

Support for additional index AMs may follow in a future release of PostgreSQL.

The Visibility Map (and other relation forks)

The Visibility Map is a simple data structure associated with every heap relation (table). It is a "relation fork"; an on-disk ancillary file associated with a particular relation (table or index). Note that index relations (that is, indexes) do not have a visibility map associated with them. The visibility map is concerned with tracking which tuples are visible to all transactions at a high level. Tuples from one transaction may or may not be visible to any given other transaction, depending on whether or not their originating transaction actually committed (yet, or ever, if the transaction aborted), and when that occurred relative to our transaction's current snapshot. Note that the exact behaviour depends on our transaction isolation level. Note also that it is quite possible for one transaction to see one physical tuple/set of values for one logical tuple, while another transaction sees other, distinct values for that same logical tuple, because, in effect, each of the two transaction has a differing idea of what constitutes "now". This is the core idea of MVCC. When there is absolute consensus that all physical tuples in a heap page are visible, the page's corresponding bit may be set.

Another relation fork that you may be familiar with is the freespace map. In contrast to the visibility map, there is a FSM for both heap and index relations (with the sole exception of hash index relations, which have none).

The purpose of the freespace map is to quickly locate a page with enough free space to hold a tuple to be stored, or to determine if no such page exists and the relation has to be extended.

In PostgreSQL 8.4, the current freespace map implementation was added. It made the freespace map an on-disk relation fork. The previous implementation required administrators to guestimate the number of relations, and the required freespace map size for each, so that the freespace map existed only in a fixed allocation of shared memory. This tended to result in wasted space due to undersizing, as the core system's storage manager needlessly extended relations::

 [peter@peterlaptop 12935]$ ls -l -h -a
 -rw-------. 1 peter peter 8.0K Sep 28 00:00 12910
 -rw-------. 1 peter peter  24K Sep 28 00:00 12910_fsm
 -rw-------. 1 peter peter 8.0K Sep 28 00:00 12910_vm
 ***SNIP***

The FSM is structured as a binary tree <ref>[1]</ref>. There is one leaf node per heap page, with non-leaf nodes stores the maximum amount of free space for any of its children. So, unlike EXPLAIN output's node costs, the values are not cumulative.

The visibility map is a simpler structure. There is one bit for each page in the heap relation that the visibility map corresponds to.

The primary practical reason for having and maintaining the visibility map is to optimise VACUUM. A set bit indicates that all tuples on the corresponding heap page are known to be visible to all transactions, and therefore that vacuuming the page is unnecessary. Like the new freespace map implementation, the visibility map was added in Postgres 8.4.

The visibility map is conservative in that a set bit (1) indicates that all tuples are visible on the page, but an unset bit (0) indicates that that condition may or may not be true <ref>[2]</ref>.

Crash safety, recovery and the visibility map


This involves WAL-logging setting a bit within the visibility map during VACUUM, and taking various special measures during recovery.

The Postgres write-ahead log is widely used to ensure crash-safety, but it is also intergral to the built-in Hot Standby/Streaming replication feature.

Recovery treats marking a page all-visible as a recovery conflict for snapshots that could still fail to see XIDs on that page. PostgreSQL may in the future try to soften this, so that the implementation simply forces index scans to do heap fetches in cases where this may be an issue, rather than throwing a hard conflict.

Covering indexes

Covering indexes are indexes creating for the express purpose of being used in index-only scans. They typically "cover" more columns than would otherwise make sense for an index, typically columns that are known to be part of particular expensive, frequently executed query's selectlist. PostgreSQL supports using just the first few columns of the index in a regular index scan if that is in the query's qual, so covering indexes need not be completely useless for regular index scans.

Interaction with HOT

HOT (Heap-only tuples) is a major performance feature that was added in Postgres 8.3. This allowed UPDATES to rows (which, owing to Postgres's MVCC architecture, are implemented with a deletion and insertion of physical tuples) to only have to create a new physical heap tuple when inserting, and not a new index tuple, if and only if the update did not affect indexed columns.

With HOT, it became possible for an index scan to traverse a so-called HOT chain; it could get from the physical index tuple (which would probably have been created by an original INSERT, and related to an earlier version of the logical tuple), to the corresponding physical heap tuple. The heap tuple would itself contain a pointer to the next version of the tuple (that is, the tuple ctid), which might, in turn, have a pointer of its own. The index scan eventually arrives at tuple that is current according to the query's snapshot.

HOT also enables opportunistic mini-vacuums, where the HOT chain is "pruned".

All told, this performance optimisation has been found to be very valuable, particularly for OLTP workloads. It is quite natural that tuples that are frequently updated are generally not indexed. However, when considering creating a covering index, the need to maximise the number of HOT updates should be carefully weighed.

You can monitor the total proportion of HOT updates for each relation using this query::

 postgres=# select n_tup_upd, n_tup_hot_upd from pg_stat_user_tables;

What types of queries may be satisfied by an index-only scan?

Aside from the obvious restriction that queries cannot reference columns that are not indexed by a single index in order to use an index-only scan, the need to visit the heap where all tuples are not known to be visible is relatively expensive. The planner weighs this factor heavily when considering an index-only scan, and in general the need to ensure that the bulk of the table's tuples have their visibility map bits set is likely to restrict index-only scans' usefulness to queries against infrequently updated tables.

It is not necessary for all bits to be set; index-only scans may "visit the heap" if that is necessary. Index-only scans are something of a misnomer, in fact - index mostly scans might be a more appropriate appellation. An explain analyze involving an index-only scan will indicate how frequently that occurred in practice::

 postgres=# explain analyze select count(*) from categories;
                                                                 QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=12.53..12.54 rows=1 width=0) (actual time=0.046..0.046 rows=1 loops=1)
    ->  Index Only Scan using categories_pkey on categories  (cost=0.00..12.49 rows=16 width=0) (actual time=0.018..0.038 rows=16 loops=1)
          Heap Fetches: 16
  Total runtime: 0.108 ms
 (4 rows)

As the number of heap "visits" goes up, the planner will eventually conclude that an index-only scan isn't desirable, as it isn't the cheapest possible plan according to its cost model. The value of index-only scans lies wholly in their potential to allow us to elide heap access (if only partially) and minimise I/O.

Is "count(*)" much faster now?

A traditional complaint made of PostgreSQL, generally when comparing it unfavourably with MySQL (at least when using the MyIsam storage engine, which doesn't use MVCC) has been "count(*) is slow". Index-only scans *can* be used to satisfy these queries without there being any predicate to limit the number of rows returned, and without forcing an index to be used by specifying that the tuples should be ordered by an indexed column. However, in practice that isn't particularly likely.

It is important to realise that the planner is concerned with minimising the total cost of the query. With databases, the cost of I/O typically dominates. For that reason, "count(*) without any qual" queries will only use an index-only scan if the index is significantly smaller than its table. This typically only happens when the table's row width is much wider than some indexes'.

Why isn't my query using an index-only scan?

VACUUM does not have any particular tendency to behave more aggressively to facilitate using index-only scans more frequently. While VACUUM can be set to behave more aggressively in various ways, it's far from clear that to do so specifically to make index-only scans occur more frequently represents a sensible course of action.

The planner doesn't directly examine the entire visibility map of a relation when considering an index-only scan (however, the executor does maintain a running tally, which is visible in explain analyze output). However, the planner does naturally weigh the proportion of pages which are known visible to all.

In Postgres 9.2, statistics are gathered about the proportion of pages that are known all-visible. The pg_class.relallvisible column indicates how many pages are visible (the proportion can be obtained by calculating it as a proportion of pg_class.relpages). These statistics are updated during VACUUM and ANALYZE.

Note that it is possible to examine the number of index scans (including index-only scans and bitmap index scans) by examining pg_stat_user_indexes.idx_scan. If your covering index isn't being used, you're essentially paying for the overhead of maintaining it during writes with no benefit in return. Drop the index!

Summary

It is possible for index-only scans to greatly decrease the amount of I/O required to execute some queries. For certain queries, particularly queries that are characteristic of data warehousing (i.e. relatively large amounts of static, infrequently-updated data where reports on historic data is frequently required), they can considerably improve performance. Such queries have been observed to execute anything from twice to twenty times as fast with index-only scans. However, one should bear in mind that:

  • Index-only scans are opportunistic, in that they take advantage of a pre-existing state of affairs where it happens to be possible to elide heap access. However, the server doesn't make any particular effort to facilitate index-only scans, and it is difficult to recommend a course of action to make index-only scans occur more frequently, except to define covering indexes in response to a measured need (For example, when pg_stat_statements indicates that a disproportionate amount of I/O is being used to execute a query against fairly static data, with a smallish subset of table columns retrieved).
  • When creating a covering index, the likely effect on HOT updates should be weighed heavily. Are there many HOT updates on the table to begin with? This is a general point of concern, because creating an index may prevent HOT updates from occurring, and because the number of HOT updates is a reasonably good proxy for just how static a table is, and therefore how likely it is that most heap pages are known to be all-visible.
  • Index-only scans are only used when the planner surmises that that will reduce the total amount of I/O required, according to its imperfect cost-based modelling. This all heavily depends on visibility of tuples, if an index would be used anyway (i.e. how selective a qual is, etc), and if there is actually an index available that could be used by an index-only scan in principle.
Personal tools