From PostgreSQL wiki
Revision as of 09:50, 11 November 2013 by Kaigai (talk | contribs) (CustomScan Provider)


The CustomScan APIs allow extensions to override part of the executor's job with custom implementations. This makes it possible to implement features such as remote join, GPU acceleration, or cache-only scan as a regular extension, without patching vanilla PostgreSQL.

An extension has to follow a few steps to integrate its own logic with the core planner and executor:

  • Register a custom-scan provider (CSP).
  • Add alternative paths, using the CustomPath structure, from the add_scan_path or add_join_path hooks.
  • Construct a CustomScan plan from the CustomPath above, if the planner chooses it.
  • Begin, execute and end the CustomScan as invoked by the core executor, via the registered callbacks.

Although the overall design is similar to FDW, the biggest difference is that a CustomScan has no table definition of its own. It therefore cannot be a target relation referenced in the user's query, but it can return tuples with an arbitrary tuple descriptor.

CustomScan Fig01.png

The figure above shows the overall processing flow.

First, the extension registers a custom-scan provider (CSP) using register_custom_provider(). This is usually done in _PG_init(), together with the registration of add_scan_path_hook and add_join_path_hook. A CSP consists of a set of callback functions and a unique name that identifies it. The name is stored in the CustomPath structure to tell the planner and executor which CSP is responsible for that path.

Next, the registered hook is invoked while the planner looks for the cheapest scan path or join path. If it is reasonable for the extension to execute this scan or join by itself, it constructs a CustomPath structure with an appropriate cost estimate and adds it using add_path().

Next, if the CustomPath is chosen, the planner constructs a CustomScan node from the path and calls the InitCustomScanPlan() callback so that the extension can initialize the CustomScan node itself.

These are the planner's jobs; the following ones are handled by the executor.

Then, the executor constructs a CustomScanState node from the plan and calls BeginCustomScan() so that the extension can initialize the CustomScanState node itself. Usually, the extension sets up its private information, opens the underlying relation, assigns a tuple slot for scanning, and so on.

Next, the executor calls the ExecCustomScan() callback each time the upper executor node requires the next tuple. The callback puts the next record in its result tuple slot, or returns NULL to terminate the scan.

Finally, the executor calls the EndCustomScan() callback to release the private resources allocated by the extension.
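
The executor-side call sequence described above can be sketched in stand-alone C. This is a mock under stated assumptions, not PostgreSQL code: MockScanState, MockCallbacks, the demo_* functions and mock_run_scan() are hypothetical stand-ins for CustomScanState, the CSP callbacks and the core executor, and a "tuple" is reduced to a pointer to an int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for CustomScanState. */
typedef struct MockScanState
{
    const int *rows;          /* underlying "relation" */
    int        nrows;
    void      *custom_state;  /* private to the provider, like custom_state */
} MockScanState;

/* Hypothetical stand-in for the executor-related callbacks of a CSP. */
typedef struct MockCallbacks
{
    void       (*begin)(MockScanState *state);
    const int *(*exec)(MockScanState *state);
    void       (*end)(MockScanState *state);
} MockCallbacks;

/* Provider side: allocate a private cursor (cf. BeginCustomScan). */
static void
demo_begin(MockScanState *state)
{
    int *cursor = malloc(sizeof(int));

    assert(cursor != NULL);
    *cursor = 0;
    state->custom_state = cursor;
}

/* Provider side: return the next row, or NULL at end (cf. ExecCustomScan). */
static const int *
demo_exec(MockScanState *state)
{
    int *cursor = state->custom_state;

    if (*cursor >= state->nrows)
        return NULL;
    return &state->rows[(*cursor)++];
}

/* Provider side: release only what the provider allocated (cf. EndCustomScan). */
static void
demo_end(MockScanState *state)
{
    free(state->custom_state);
    state->custom_state = NULL;
}

/* Executor side: begin once, pull rows until NULL, then end. Returns the sum. */
static long
mock_run_scan(const MockCallbacks *cb, MockScanState *state)
{
    long       sum = 0;
    const int *row;

    cb->begin(state);
    while ((row = cb->exec(state)) != NULL)
        sum += *row;
    cb->end(state);
    return sum;
}
```

The point of the sketch is the division of labour: the driver loop never looks inside custom_state, and the provider frees only the memory it allocated itself.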

Data Structure


typedef struct CustomPath
{
   Path        path;
   const char *custom_name;        /* name of custom scan provider */
   int         custom_flags;       /* CUSTOM__* flags in nodeCustom.h */
   List       *custom_private;     /* can be used for private data */
} CustomPath;

The CustomPath structure extends Path, so the estimated costs are stored there. The custom_name field holds the name of the registered custom-scan provider. The custom_flags field is a set of the CUSTOM__* flags described below, which tell the planner what the provider supports. The custom_private field holds private data for arbitrary use by the custom-scan provider. (I'm uncertain whether it is really safe to require that it work with copyObject; a void * might make sense.)
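
"Extends" here means the usual C idiom of embedding the parent structure as the first member, so that a pointer to the child can be passed wherever the parent is expected. A minimal sketch of that idiom follows. MockPath, MockCustomPath, MockTag and the helper functions are hypothetical simplifications, not the real Path and CustomPath definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node tags; PostgreSQL uses NodeTag for the same purpose. */
typedef enum MockTag
{
    T_MockPath,
    T_MockCustomPath
} MockTag;

/* Simplified parent: only a tag and the cost fields. */
typedef struct MockPath
{
    MockTag type;
    double  startup_cost;
    double  total_cost;
} MockPath;

/* Child: the parent must be the first member for the casts below to work. */
typedef struct MockCustomPath
{
    MockPath    path;
    const char *custom_name;   /* name of the provider that built the path */
} MockCustomPath;

/* Generic code reads costs through the parent type only. */
static double
mock_total_cost(const MockPath *path)
{
    assert(path != NULL);
    return path->total_cost;
}

/* Downcast after checking the tag; returns 1 if the path belongs to "name". */
static int
mock_path_is_from(const MockPath *path, const char *name)
{
    if (path->type != T_MockCustomPath)
        return 0;
    return strcmp(((const MockCustomPath *) path)->custom_name, name) == 0;
}
```

Code that only compares costs can treat every path uniformly, while provider-specific code checks the tag before looking at the extra fields.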


typedef struct CustomScan
{
   Scan        scan;

   const char *custom_name;        /* name of custom scan provider */
   int         custom_flags;       /* a set of CUSTOM__* flags */
   List       *custom_private;     /* private data for CSP  */
   List       *custom_exprs;       /* expressions that CSP may execute */

   Plan       *subqry_plan;        /* valid, if RTE_SUBQUERY */
   Node       *funcexpr;           /* valid, if RTE_FUNCTION */
} CustomScan;

The CustomScan structure extends Scan, so it inherits scanrelid, the cost parameters, the left and right subtrees, and so on from Scan and from Plan, which is the superclass of Scan. The roles of custom_name, custom_flags and custom_private are the same as in CustomPath. The custom_exprs field is a list of expressions moved out of scan.plan.qual so that the custom-scan provider can evaluate them itself. The subqry_plan and funcexpr fields are nodes to be referenced by the EXPLAIN command when scan.scanrelid refers to an RTE for a sub-query or a function scan. The custom-scan provider may reference these nodes in its own way, but should not modify them.


typedef struct CustomScanState
{
   ScanState   ss;

   /* use struct pointer to avoid including nodeCustom.h here */
   struct CustomProvider *custom_provider;
   int         custom_flags;
   void       *custom_state;
} CustomScanState;

The CustomScanState structure extends ScanState, so it inherits all of its properties. The custom_provider field is a reference to the registered provider identified by custom_name in CustomScan. The custom_flags field is the same as in CustomScan. The custom_state field holds private state information to be initialized at ExecInitCustomScan().


The flags below can be set in custom_flags to inform the planner of the expected behavior of the custom-scan node. The CSP is responsible for providing the functionality that the flags advertise.

  • Mark/restore support: tells the planner that this CustomScan node supports ExecMarkPos / ExecRestorePos. The custom-scan provider is responsible for marking and restoring a particular position.
  • Backward scan support: tells the planner that this CustomScan node supports backward scan. The custom-scan provider is responsible for implementing the backward scan.
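
As an illustration of how such capability flags are typically tested, here is a small bitmask sketch. The MOCK_* names and their values are assumptions made up for this example; the real CUSTOM__* definitions live in nodeCustom.h and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, standing in for the CUSTOM__* flags. */
#define MOCK_SUPPORT_MARK_RESTORE   0x0001
#define MOCK_SUPPORT_BACKWARD_SCAN  0x0002
#define MOCK_ALL_FLAGS \
    (MOCK_SUPPORT_MARK_RESTORE | MOCK_SUPPORT_BACKWARD_SCAN)

/* True if every bit in "flag" is set in "custom_flags". */
static bool
mock_has_flag(int custom_flags, int flag)
{
    /* Only known, non-empty flag sets may be asked about. */
    assert(flag != 0 && (flag & ~MOCK_ALL_FLAGS) == 0);
    return (custom_flags & flag) == flag;
}

/* Planner-side question: may this node be asked to mark and restore? */
static bool
mock_supports_mark_restore(int custom_flags)
{
    return mock_has_flag(custom_flags, MOCK_SUPPORT_MARK_RESTORE);
}

/* Planner-side question: may this node be scanned backward? */
static bool
mock_supports_backward_scan(int custom_flags)
{
    return mock_has_flag(custom_flags, MOCK_SUPPORT_BACKWARD_SCAN);
}
```

A provider that sets neither bit is never asked to mark, restore or scan backward, so it may leave the corresponding callbacks unimplemented.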

Related Hooks

When the planner constructs the best execution plan, it compares multiple candidate paths to find the cheapest way to scan a particular relation, or the cheapest way to join two relations.

The hooks below allow extensions to add alternative paths in addition to the paths provided by the built-in implementation.


typedef void (*add_scan_path_hook_type)(PlannerInfo *root,
                                        RelOptInfo *baserel,
                                        RangeTblEntry *rte);
extern PGDLLIMPORT add_scan_path_hook_type add_scan_path_hook;

This hook is called to add alternative paths for base relations (tables, functions, values, ...). If the extension can offer a path in place of the built-in scan logic, it can add its CustomPath here.
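
The usual PostgreSQL convention for installing a hook is to save the previous value and chain to it, so that several extensions can coexist. The sketch below shows that convention with mocked types. MockRel, mock_add_path(), the cost values and every mock_* or my_* name are assumptions for illustration; in particular, the real add_path() keeps a set of non-dominated paths rather than a single cheapest cost.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RelOptInfo: tracks only the cheapest cost. */
typedef struct MockRel
{
    int    npaths;         /* number of candidate paths offered so far */
    double cheapest_cost;  /* total cost of the cheapest candidate */
} MockRel;

typedef void (*mock_scan_path_hook_type)(MockRel *rel);

/* Stand-in for the global add_scan_path_hook variable. */
static mock_scan_path_hook_type mock_scan_path_hook = NULL;

/* Previous hook value, saved so that it can be chained. */
static mock_scan_path_hook_type prev_scan_path_hook = NULL;

/* Greatly simplified add_path(): remember the cheapest cost only. */
static void
mock_add_path(MockRel *rel, double total_cost)
{
    assert(rel != NULL);
    if (rel->npaths == 0 || total_cost < rel->cheapest_cost)
        rel->cheapest_cost = total_cost;
    rel->npaths++;
}

/* The extension's hook: chain first, then offer its own path. */
static void
my_scan_path_hook(MockRel *rel)
{
    if (prev_scan_path_hook != NULL)
        prev_scan_path_hook(rel);
    mock_add_path(rel, 42.0);   /* hypothetical cost estimate */
}

/* What _PG_init() would do: save the old hook, install the new one. */
static void
mock_pg_init(void)
{
    prev_scan_path_hook = mock_scan_path_hook;
    mock_scan_path_hook = my_scan_path_hook;
}

/* Planner side: add the built-in path, then give extensions a chance. */
static void
mock_plan_base_rel(MockRel *rel)
{
    mock_add_path(rel, 100.0);  /* hypothetical built-in scan cost */
    if (mock_scan_path_hook != NULL)
        mock_scan_path_hook(rel);
}
```

Because the custom path competes on cost like any other candidate, the extension does not force its use; the planner still picks whichever path is cheapest.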


typedef void (*add_join_path_hook_type)(PlannerInfo *root,
                                        RelOptInfo *joinrel,
                                        RelOptInfo *outerrel,
                                        RelOptInfo *innerrel,
                                        JoinType jointype,
                                        SpecialJoinInfo *sjinfo,
                                        List *restrictlist,
                                        List *mergeclause_list,
                                        SemiAntiJoinFactors *semifactors,
                                        Relids param_source_rels,
                                        Relids extra_lateral_rels);
extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;

This hook is called to add alternative paths for a join relation with a particular pair of outer and inner relations. If the extension can offer a path in place of the built-in join logic, it can add a CustomPath that behaves like a scan on the materialized join result.

CustomScan Provider


typedef struct CustomProvider
{
   char                            name[NAMEDATALEN];

   InitCustomScanPlan_function     InitCustomScanPlan;
   SetPlanRefCustomScan_function   SetPlanRefCustomScan;

   BeginCustomScan_function        BeginCustomScan;
   ExecCustomScan_function         ExecCustomScan;
   MultiExecCustomScan_function    MultiExecCustomScan;
   EndCustomScan_function          EndCustomScan;

   ReScanCustomScan_function       ReScanCustomScan;
   ExecMarkPosCustomScan_function  ExecMarkPosCustomScan;
   ExecRestorePosCustom_function   ExecRestorePosCustom;

   ExplainCustomScan_function      ExplainCustomScan;
} CustomProvider;

void register_custom_provider(const CustomProvider *provider);

This function registers a named set of callback functions as a custom-scan provider. It is usually called in the extension's _PG_init(), together with the registration of add_scan_path_hook and add_join_path_hook.
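
To make the role of the provider name concrete, the following sketch models a registry keyed by name. It is not the implementation of register_custom_provider(): MockProvider, the fixed-size array, MAX_PROVIDERS, the duplicate-name policy and the return codes are all assumptions chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAMEDATALEN   64   /* PostgreSQL's default value; assumed here */
#define MAX_PROVIDERS 8    /* hypothetical capacity of this sketch */

/* Hypothetical stand-in for CustomProvider: a name plus one callback slot. */
typedef struct MockProvider
{
    char  name[NAMEDATALEN];
    void (*begin_scan)(void *state);   /* placeholder for BeginCustomScan */
} MockProvider;

static MockProvider registry[MAX_PROVIDERS];
static int          num_providers = 0;

/* Find a provider by name, as the planner or executor would; NULL if absent. */
static const MockProvider *
mock_lookup_provider(const char *name)
{
    for (int i = 0; i < num_providers; i++)
    {
        if (strcmp(registry[i].name, name) == 0)
            return &registry[i];
    }
    return NULL;
}

/*
 * Copy the provider into the registry.
 * Returns 0 on success, -1 if the name is taken or the registry is full.
 */
static int
mock_register_provider(const MockProvider *provider)
{
    assert(provider != NULL);
    if (num_providers >= MAX_PROVIDERS ||
        mock_lookup_provider(provider->name) != NULL)
        return -1;
    registry[num_providers++] = *provider;
    return 0;
}
```

Copying the structure, rather than keeping the caller's pointer, means the caller may build the provider on the stack inside its initialization function.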


void InitCustomScanPlan(PlannerInfo *root,
                        CustomScan *cscan_plan,
                        CustomPath *cscan_path,
                        List *tlist,
                        List *scan_clauses);

Once a CustomPath has been chosen, the planner creates a CustomScan object initialized according to the CustomPath. This callback lets the CSP perform the final initialization of the CustomScan. At a minimum, it has to set the target list and the qualifiers of the plan after suitable pre-processing, such as separating remote qualifiers from local ones.

SetPlanRefCustomScan (optional)

void SetPlanRefCustomScan(PlannerInfo *root,
                          CustomScan *cscan_plan,
                          int rtoffset);

It gives the CSP a chance to adjust the varno of the Var nodes in the target list during setrefs.c processing. When scanrelid is valid, which means that this CustomScan node scans a regular relation just as the built-in scan nodes do, this callback is optional, because set_plan_refs adjusts varno by rtoffset in the target list, the scan qualifiers and the custom_exprs list, as it does for other scan nodes.

Otherwise, for example when a CustomScan replaces a built-in join, the CSP has to handle the adjustment itself. The Var nodes associated with such a custom scan carry the varno of either the left or the right relation, whereas the CustomScan returns tuples that are already joined, so the attribute numbers have to be remapped by the CSP.

CUSTOM_VARNO is useful here as the new varno to assign. A Var with this varno references a tuple in ss.ss_ScanTupleSlot, and EXPLAIN shows its attribute name according to ss.ss_currentScanDesc.
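
The remapping for the join case can be pictured as follows: each Var that pointed at one of the joined relations is redirected to the matching column of the joined tuple. The sketch below is only an illustration of that idea. MockVar, mock_fixup_var(), the column-list representation and the numeric value of MOCK_CUSTOM_VARNO are assumptions, not the definitions used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value; stands in for CUSTOM_VARNO. */
#define MOCK_CUSTOM_VARNO 65003

/* Simplified Var: which relation (varno) and which column (varattno). */
typedef struct MockVar
{
    int varno;
    int varattno;
} MockVar;

/*
 * columns[i] records where output column i + 1 of the joined tuple came
 * from. If "var" matches one of them, rewrite it to reference that output
 * column via MOCK_CUSTOM_VARNO and return 0; otherwise leave it untouched
 * and return -1.
 */
static int
mock_fixup_var(MockVar *var, const MockVar *columns, int ncolumns)
{
    assert(var != NULL && columns != NULL);
    for (int i = 0; i < ncolumns; i++)
    {
        if (columns[i].varno == var->varno &&
            columns[i].varattno == var->varattno)
        {
            var->varno = MOCK_CUSTOM_VARNO;
            var->varattno = i + 1;   /* attribute numbers start at 1 */
            return 0;
        }
    }
    return -1;
}
```

For example, with joined columns taken from (rel 1, att 2), (rel 2, att 1) and (rel 2, att 3), a Var for (rel 2, att 3) becomes column 3 of the joined tuple.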


void BeginCustomScan(CustomScanState *csstate, int eflags);

It lets the CSP perform the final initialization of the provided CustomScanState object according to the plan. By the time this callback is called, the executor has already created the CustomScanState object and done the basic initialization: setting up the expression contexts for the target list and qualifiers, assigning the result (or scan) tuple slot, opening the target table with a suitable lock level if scanrelid is valid, and so on. The CSP is responsible for initializing its private data, which is usually saved in the custom_state field. When the custom scan replaces a join, the CSP needs to assign the scan tuple slot itself, since the core executor cannot do so.


TupleTableSlot *ExecCustomScan(CustomScanState *csstate);

It requires the CSP to return a tuple in ps_ResultTupleSlot, or NULL if no more tuples can be returned. It usually calls ExecScan() with its own access method and recheck method. The access method copies the fetched data, in its own way, into the result tuple slot, and ExecScan() then evaluates the local qualifiers.


Node *MultiExecCustomScan(CustomScanState *csstate);

It requires the CSP to return multiple tuples at once, in the form expected by the parent node.


void EndCustomScan(CustomScanState *csstate);

It gives the CSP a chance to release all of its execution-time resources. The CSP only needs to release the resources it acquired itself. Resources acquired by the core executor (such as the execution memory context, or the relation descriptor if scanrelid is valid) are released by the core.


void ReScanCustomScan(CustomScanState *csstate);

It requires the CSP to reset the scan position. We might add a flag to tell the planner whether the CSP supports this feature.

ExecMarkPosCustomScan (optional)

void ExecMarkPosCustomScan(CustomScanState *csstate);

It makes the CSP save the current scan position. Unless the CSP uses ss_currentScanDesc of ScanState, there is no dedicated storage for this purpose, so the CSP has to save the position in its private state.

ExecRestorePosCustom (optional)

void ExecRestorePosCustom(CustomScanState *csstate);

It makes the CSP restore the scan position that was previously saved.
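
A minimal sketch of how a provider might keep the scan position in its private state, covering the rescan, mark and restore callbacks described above. MockCursor and the mock_* functions are hypothetical; a real provider would keep this data behind custom_state and return tuples rather than row indexes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private state of a provider that scans "nrows" rows. */
typedef struct MockCursor
{
    int nrows;        /* total number of rows available */
    int pos;          /* index of the next row to return */
    int marked_pos;   /* position remembered by the mark callback */
} MockCursor;

/* Return the index of the next row, or -1 when the scan is exhausted. */
static int
mock_next(MockCursor *cursor)
{
    assert(cursor != NULL);
    if (cursor->pos >= cursor->nrows)
        return -1;
    return cursor->pos++;
}

/* cf. ExecMarkPosCustomScan: remember the current position. */
static void
mock_mark_pos(MockCursor *cursor)
{
    cursor->marked_pos = cursor->pos;
}

/* cf. ExecRestorePosCustom: go back to the remembered position. */
static void
mock_restore_pos(MockCursor *cursor)
{
    cursor->pos = cursor->marked_pos;
}

/* cf. ReScanCustomScan: restart the scan from the beginning. */
static void
mock_rescan(MockCursor *cursor)
{
    cursor->pos = 0;
}
```

After marking at row 1 and reading to the end, a restore makes the next fetch return row 1 again, while a rescan makes it return row 0.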

ExplainCustomScan (optional)

void ExplainCustomScan(CustomScanState *csstate,
                       ExplainState *es);

It allows the CSP to show additional information in the output of the EXPLAIN command.

Expected Usage

Remote join on postgres_fdw
It allows postgres_fdw to replace a local join between two remote relations with a custom scan that fetches the result of a remote join query instead.
Inequality operator on ctid reference
It allows a sequential scan to start from, or terminate at, a particular record, which helps to reduce the number of tuples to be processed.
GPU accelerated scan
It allows the calculations around a sequential scan or a table join to be off-loaded.
Cache-only database scan
It allows scanning an in-memory database cache if and when all the referenced columns are cached.