CustomScanInterface

From PostgreSQL wiki

Writing A Custom Scan Provider

Overview

Prior to query execution, the PostgreSQL planner constructs a plan tree that usually consists of built-in plan nodes (e.g., SeqScan, HashJoin). The custom-scan interface allows an extension to register a custom-scan provider that implements its own logic for scanning relations, in addition to the built-in nodes. If the planner chooses a custom-scan node, the callback functions associated with that node are invoked during query execution. The custom-scan provider must return the same result set that the built-in logic would, but it is free to scan the relation according to its own logic.

This chapter explains how to write a custom-scan provider.

Interaction with Planner

During plan construction, the planner asks extensions whether they can provide an alternative scan path through a hook: set_rel_pathlist_hook for relation scans. The hook receives enough information for the extension to decide whether it can offer an alternative scan path on the particular relation; if so, it can add one or more custom paths to the RelOptInfo of the relation to be scanned. The custom-scan provider is responsible for setting a reasonable cost estimate on the CustomPath node, because cost is the only basis on which the core planner decides which Path to choose. The cost logic of the built-in scan paths is a good reference.
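To illustrate why the cost estimate matters, here is a minimal sketch of a cost-based choice among candidate paths. It is not the planner's actual algorithm: the real path-selection code also weighs startup cost, sort order, and parameterization. DemoPath and cheapest_path are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a planner Path. */
typedef struct DemoPath
{
    const char *name;       /* label for the candidate path */
    double      total_cost; /* estimated cost to fetch all rows */
} DemoPath;

/*
 * Return the candidate with the lowest total cost, or NULL when there are
 * no candidates. Ties keep the earliest candidate.
 */
static const DemoPath *
cheapest_path(const DemoPath *paths, size_t npaths)
{
    const DemoPath *best = NULL;

    assert(paths != NULL || npaths == 0);
    for (size_t i = 0; i < npaths; i++)
    {
        if (best == NULL || paths[i].total_cost < best->total_cost)
            best = &paths[i];
    }
    return best;
}
```

In this simplified model, a custom path with an unrealistically low estimate would always win, and one with an inflated estimate would never be chosen, which is why the estimate should be comparable to those of the built-in paths.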

Once the planner has chosen a custom path, the custom-scan provider has to populate a CustomScan structure from it. The CustomScan structure has two special fields for private information: custom_exprs and custom_private. custom_exprs is intended to hold expression trees that are updated in setrefs.c and subselect.c. custom_private is intended to hold private information that nothing except the custom-scan provider itself will touch. A plan tree that contains a custom-scan node can be duplicated using copyObject(), so anything stored in these two fields must support copyObject().

Interaction with Executor

Once a plan tree is handed to the executor, the executor constructs plan-state objects according to the plan nodes. Custom-scan is no exception. If the executor finds a CustomScan node in the supplied plan tree, it invokes a callback to populate a CustomScanState node. Unlike the CustomScan node, CustomScanState has no fields for private information. Instead, the custom-scan provider can allocate an object larger than CustomScanState, as long as its header layout is compatible with CustomScanState. This mirrors the relationship between the ScanState structure and PlanState, where ScanState extends the generic plan-state with scan-specific fields. The custom-scan provider can add whatever fields it needs in the same way.
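The "larger object with a compatible header" idea can be sketched in plain C. The types below are simplified stand-ins, not the real PostgreSQL definitions (the real CustomScanState has more fields and is declared in the server headers). MyScanState, my_create_state, my_advance, and the row_cursor field are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the real CustomScanState (assumption). */
typedef struct CustomScanState
{
    int          type_tag;  /* stand-in for the node tag */
    unsigned int flags;
} CustomScanState;

/*
 * Hypothetical provider-private state. The CustomScanState header must be
 * the first member, so that a pointer to the whole object and a pointer to
 * the header refer to the same address.
 */
typedef struct MyScanState
{
    CustomScanState css;        /* common header, must come first */
    long            row_cursor; /* private field (hypothetical) */
} MyScanState;

/* Allocate the larger object, but hand it out as the generic header type. */
static CustomScanState *
my_create_state(void)
{
    MyScanState *state = calloc(1, sizeof(MyScanState));

    assert(state != NULL);
    state->css.type_tag = 1;
    return &state->css;
}

/* Recover the private fields from the generic pointer in a later callback. */
static long
my_advance(CustomScanState *node)
{
    MyScanState *state = (MyScanState *) node;

    return ++state->row_cursor;
}
```

The cast in my_advance is valid only because css is the first member of MyScanState. A real provider would allocate the object with the backend's memory-context allocator rather than calloc.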

Once a CustomScanState gets constructed, BeginCustomScan is invoked during executor initialization; ExecCustomScan is repeatedly called during execution (returning a TupleTableSlot with each fetched record), then EndCustomScan is invoked on cleanup of the executor.
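The call sequence can be modeled with a small callback table. This is a sketch under simplifying assumptions: the struct definitions are stand-ins for the real executor types, the stand-in BeginCustomScan drops the EState and eflags arguments of the real callback, and the demo_* functions and run_scan driver are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real executor types (assumptions, heavily simplified). */
typedef struct TupleTableSlot { int value; } TupleTableSlot;
typedef struct CustomScanState CustomScanState;

typedef struct CustomExecMethods
{
    void            (*BeginCustomScan) (CustomScanState *node);
    TupleTableSlot *(*ExecCustomScan) (CustomScanState *node);
    void            (*EndCustomScan) (CustomScanState *node);
} CustomExecMethods;

struct CustomScanState
{
    const CustomExecMethods *methods;
    TupleTableSlot  slot;    /* slot handed back to the caller */
    const int      *rows;    /* in-memory "relation" for this demo */
    size_t          nrows;
    size_t          pos;     /* index of the next row to return */
    int             is_open; /* set between Begin and End */
};

static void
demo_begin(CustomScanState *node)
{
    node->pos = 0;
    node->is_open = 1;
}

/* Return the next row in the slot, or NULL at the end of the scan. */
static TupleTableSlot *
demo_exec(CustomScanState *node)
{
    assert(node->is_open);
    if (node->pos >= node->nrows)
        return NULL;
    node->slot.value = node->rows[node->pos++];
    return &node->slot;
}

static void
demo_end(CustomScanState *node)
{
    node->is_open = 0;
}

static const CustomExecMethods demo_methods = {demo_begin, demo_exec, demo_end};

/* Drive one scan in Begin / Exec... / End order and sum the values seen. */
static long
run_scan(CustomScanState *node)
{
    long            sum = 0;
    TupleTableSlot *slot;

    node->methods->BeginCustomScan(node);
    while ((slot = node->methods->ExecCustomScan(node)) != NULL)
        sum += slot->value;
    node->methods->EndCustomScan(node);
    return sum;
}
```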

Hooks to add custom-paths

There are two hooks that extensions can use to add custom paths, depending on the task.

typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
                                            RelOptInfo *rel,
                                            Index rti,
                                            RangeTblEntry *rte);
extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;

This hook is invoked when the planner investigates the potential scan paths on a particular relation. If the custom-scan provider can offer alternative scan paths, it can add custom paths to the supplied relation using add_path().
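Extensions conventionally install such a hook by saving the previous value and calling it from their own function, so that several extensions can coexist. The sketch below models that pattern with stand-in types. In a real extension the hook variable is provided by the core backend and the installation would happen in the module's load-time initialization. my_set_rel_pathlist, my_install_hook, earlier_hook, and the npaths counter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque or simplified stand-ins for planner types (assumptions). */
typedef struct PlannerInfo PlannerInfo;
typedef struct RangeTblEntry RangeTblEntry;
typedef unsigned int Index;
typedef struct RelOptInfo { int npaths; } RelOptInfo;

typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
                                            RelOptInfo *rel,
                                            Index rti,
                                            RangeTblEntry *rte);

/* In a real extension this variable is exported by the core backend. */
static set_rel_pathlist_hook_type set_rel_pathlist_hook = NULL;

/* Whatever hook was installed before ours, if any. */
static set_rel_pathlist_hook_type prev_set_rel_pathlist_hook = NULL;

/* Stand-in for a hook that another extension installed earlier. */
static void
earlier_hook(PlannerInfo *root, RelOptInfo *rel, Index rti, RangeTblEntry *rte)
{
    (void) root; (void) rti; (void) rte;
    rel->npaths += 10;
}

static void
my_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                    Index rti, RangeTblEntry *rte)
{
    /* Give earlier hook users their turn first. */
    if (prev_set_rel_pathlist_hook != NULL)
        prev_set_rel_pathlist_hook(root, rel, rti, rte);

    /* A real provider would build a CustomPath and register it here. */
    assert(rel != NULL);
    rel->npaths += 1;
}

/* Save the current hook, then put ours in its place. */
static void
my_install_hook(void)
{
    prev_set_rel_pathlist_hook = set_rel_pathlist_hook;
    set_rel_pathlist_hook = my_set_rel_pathlist;
}
```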

typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
                                             RelOptInfo *joinrel,
                                             RelOptInfo *outerrel,
                                             RelOptInfo *innerrel,
                                             List *restrictlist,
                                             JoinType jointype,
                                             SpecialJoinInfo *sjinfo,
                                             SemiAntiJoinFactors *semifactors,
                                             Relids param_source_rels,
                                             Relids extra_lateral_rels);
extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;

This hook is invoked when the planner investigates potential join paths for a particular join of relations. If the custom-scan provider can offer an alternative scan path, it can add a custom path to the supplied join relation using add_path(). From the viewpoint of the executor, this looks like a relation scan, but on a pseudo-relation that is materialized from multiple relations. The custom-scan provider is expected to process the join internally with its own logic, then return tuples that match the tuple descriptor of this scan node.

Functions of CustomPathMethods

A CustomPathMethods table contains a set of callbacks related to CustomPath node. The core backend invokes these callbacks during query planning.

Plan *(*PlanCustomPath) (PlannerInfo *root,
                         RelOptInfo *rel,
                         CustomPath *best_path,
                         List *tlist,
                         List *clauses);

This callback is invoked when the core backend tries to populate a CustomScan node from the supplied CustomPath node. The custom-scan provider is responsible for allocating a CustomScan node and initializing each of its fields.


void (*TextOutCustomPath) (StringInfo str,
                           const CustomPath *node);

This optional callback is invoked when nodeToString() creates a text representation of a CustomPath node. A custom-scan provider can use this callback if it wants to output something additional. Note that expression nodes linked to custom_private are converted to text by the core, so the extension has nothing to do for them.

CustomScan structure

The PlanCustomPath method constructs a custom-plan object if the planner considers the path reasonable enough. At this moment, CustomScan is the only custom plan node type that can be populated from a custom-path node. It has a few key fields that control its behavior, which the custom-scan provider must set and initialize.

scanrelid is a field of the Scan structure embedded in CustomScan. Usually, it is a positive range-table index that identifies which relation is associated with this scan node. However, a scanrelid of 0 has a special meaning for CustomScan (and ForeignScan) nodes: the node is not associated with any particular physical relation and therefore performs a scan on a pseudo-relation.

flags gives the planner a hint about the expected behavior of this custom-scan node. CUSTOMPATH_SUPPORT_BACKWARD_SCAN indicates that the custom-scan node supports backward scans. CUSTOMPATH_SUPPORT_MARK_RESTORE indicates that the custom-scan node supports marking and restoring a position during execution.
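Because flags is a bitmask, capabilities are combined with bitwise OR and tested with bitwise AND. The following sketch shows the idiom. The numeric values are an assumption intended to mirror the PostgreSQL headers, and a real extension should include those headers rather than redefine the macros. The helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: values chosen to mirror the server headers. */
#define CUSTOMPATH_SUPPORT_BACKWARD_SCAN 0x0001
#define CUSTOMPATH_SUPPORT_MARK_RESTORE  0x0002

/* True if the node advertises backward-scan support. */
static bool
supports_backward_scan(unsigned int flags)
{
    return (flags & CUSTOMPATH_SUPPORT_BACKWARD_SCAN) != 0;
}

/* True if the node advertises mark/restore support. */
static bool
supports_mark_restore(unsigned int flags)
{
    return (flags & CUSTOMPATH_SUPPORT_MARK_RESTORE) != 0;
}

/*
 * Hypothetical guard a provider could place at the top of its mark and
 * restore callbacks to catch calls made without the capability flag.
 */
static void
require_mark_restore(unsigned int flags)
{
    assert(supports_mark_restore(flags));
}
```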

custom_ps_tlist is a list of TargetEntry nodes that tells the core planner and executor the expected data types of the records to be returned. The executor initializes the CustomScanState node according to this pseudo-scan tlist, even if the node is not associated with a physical relation.

Functions of CustomScanMethods

A CustomScanMethods table contains a set of callbacks related to the CustomScan node. The core backend invokes these callbacks once a plan containing the node exists, for example when the executor is initialized.

Node *(*CreateCustomScanState) (CustomScan *cscan);

This callback is invoked when the core backend tries to populate a CustomScanState node from the supplied CustomScan node. The custom-scan provider is responsible for allocating a CustomScanState (or its own data type that extends it), but it does not need to initialize the fields here: ExecInitCustomScan initializes the fields in CustomScanState, and then BeginCustomScan is called at the end of executor initialization.

void (*TextOutCustomScan) (StringInfo str,
                           const CustomScan *node);

This optional callback is invoked when nodeToString() creates a text representation of a CustomScan node. A custom-scan provider can use this callback if it wants to output something additional. Note that expression nodes linked to custom_exprs and custom_private are converted to text by the core, and an extension is not allowed to extend the data type, so this callback usually does not need to be implemented.

Functions of CustomExecMethods

void (*BeginCustomScan) (CustomScanState *node,
                         EState *estate,
                         int eflags);

This callback allows a custom-scan provider to initialize the CustomScanState node. The supplied CustomScanState is already partially initialized according to the scanrelid of the CustomScan node. If the custom-scan provider wants to apply additional initialization to its private fields, it can do so in this callback.

TupleTableSlot *(*ExecCustomScan) (CustomScanState *node);

This callback requires the custom-scan provider to produce the next tuple of the relation scan. If a tuple is available, the provider should store it in ps_resultxxx and return the tuple slot. Otherwise, it should return NULL or an empty slot to tell the upper node that the relation scan has ended.

void (*EndCustomScan) (CustomScanState *node);

This callback allows a custom-scan provider to clean up the CustomScanState node. If the supplied CustomScanState holds any private resources that are not released automatically, the provider can release them here, prior to the cleanup of the common portion.

void (*ReScanCustomScan) (CustomScanState *node);

This callback requires the custom-scan provider to rewind the current scan position to the head of the relation. The custom-scan provider is expected to reset its internal state so that the relation scan can restart.

void (*MarkPosCustomScan) (CustomScanState *node);

This optional callback requires the custom-scan provider to save the current scan position in its internal state, so that the position can be restored later by the RestrPosCustomScan callback. It is never called unless the CUSTOMPATH_SUPPORT_MARK_RESTORE flag is set.

void (*RestrPosCustomScan) (CustomScanState *node);

This optional callback requires the custom-scan provider to restore the scan position that was previously saved by the MarkPosCustomScan callback. It is never called unless the CUSTOMPATH_SUPPORT_MARK_RESTORE flag is set.
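The rescan, mark, and restore callbacks all amount to manipulating a saved scan position. A minimal model over an in-memory array might look like the following. DemoScanState and the demo_* functions are hypothetical. In a real provider these fields would live in the provider's own structure that extends CustomScanState, and the functions would take a CustomScanState pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private scan state for an in-memory "relation". */
typedef struct DemoScanState
{
    const int  *rows;
    size_t      nrows;
    size_t      pos;        /* index of the next row to return */
    size_t      marked_pos; /* position saved by demo_markpos() */
} DemoScanState;

/* Fetch the next row, or NULL at the end of the scan. */
static const int *
demo_next(DemoScanState *state)
{
    assert(state->pos <= state->nrows);
    if (state->pos == state->nrows)
        return NULL;
    return &state->rows[state->pos++];
}

/* Models ReScanCustomScan: rewind to the head of the relation. */
static void
demo_rescan(DemoScanState *state)
{
    state->pos = 0;
}

/* Models MarkPosCustomScan: remember the current position. */
static void
demo_markpos(DemoScanState *state)
{
    state->marked_pos = state->pos;
}

/* Models RestrPosCustomScan: jump back to the remembered position. */
static void
demo_restrpos(DemoScanState *state)
{
    state->pos = state->marked_pos;
}
```

A provider whose position cannot be captured this cheaply (for example, one that streams from an external device) may prefer not to set CUSTOMPATH_SUPPORT_MARK_RESTORE at all.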

void (*ExplainCustomScan) (CustomScanState *node,
                           List *ancestors,
                           ExplainState *es);

This optional callback allows a custom-scan provider to output additional information for an EXPLAIN command that involves a custom-scan node. Note that EXPLAIN already outputs the common items (target list, qualifiers, and the relation to be scanned), so this callback is useful when the custom-scan provider wants to show something beyond those items.