Re: Hooks to Modify Execution Flow and Query Planner

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Vincent Mirian <vince(dot)mirian(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Hooks to Modify Execution Flow and Query Planner
Date: 2018-11-01 08:56:55
Message-ID: 410b99fe-e26f-9e59-c91f-8a2492497e3f@lab.ntt.co.jp
Lists: pgsql-hackers

Hi,

On 2018/11/01 16:58, Vincent Mirian wrote:
> Hi all,
>
> I would like to create a library with UDFs written in C that implements
> different Query Planner tasks (e.g. scan, hash, join, etc...). I am looking
> for a document that provides an overview of execution flow within postgres
> and the query planner. I am also looking for a description of the software
> data structures and interfaces used.

Perhaps the following chapter in the Postgres documentation will help as a start:

https://www.postgresql.org/docs/11/static/overview.html

For studying internal data structures and interfaces, you can also read
the comments in the source code and the README files that describe
various data structures and interfaces, which is an often-recommended
method.

Specifically, if you just want to inject alternative plan nodes for the
individual scan, hash, join operators needed to compute a query, but want
the Postgres query planner to take care of the whole planning itself, you
might consider looking into the Custom Scan Provider facility:

https://www.postgresql.org/docs/current/static/custom-scan.html

With it, you can write C code that gets invoked at certain points during
planning and execution, where you can add your special/custom node to a
plan and do execution-related tasks on those nodes, respectively. With
this approach, the Postgres planner and executor take care of most of the
details of planning and execution, whereas your code implements the
specialized logic you developed for, say, scanning a disk file, joining
two or more tables, building a hash table from the data read from a table,
etc.

You can alternatively formulate your code as a foreign data wrapper if all
you want to do is model a non-Postgres data source as regular Postgres tables.

https://www.postgresql.org/docs/11/static/fdwhandler.html

If you don't intend to add new plan nodes or define a new type of foreign
table, but want to alter the planning or execution itself (or parts
thereof), you may want to look at the various planner and executor hooks. For
example, if you want to replace the whole planner, which takes a parsed
query (the Query struct) and returns a plan (the PlannedStmt struct, whose
internals you'll need to understand so that your alternative planning code
can produce a valid one), you can use the following hook:

/* Hook for plugins to get control in planner() */
typedef PlannedStmt *(*planner_hook_type) (Query *parse,
                                           int cursorOptions,
                                           ParamListInfo boundParams);
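
To illustrate how such a hook is usually installed, here is a minimal
standalone sketch of the save-previous-and-chain pattern. It is not a real
extension: Query, PlannedStmt and ParamListInfo below are stand-in types so
that the file compiles without the Postgres headers, the
planned_by_extension field and my_extension_init() are hypothetical names,
and in an actual module the installation step would live in _PG_init().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types (assumption): NOT the real Postgres structs. */
typedef struct Query { int commandType; } Query;
typedef struct PlannedStmt { int planned_by_extension; } PlannedStmt;
typedef struct ParamListInfoData *ParamListInfo;

typedef PlannedStmt *(*planner_hook_type) (Query *parse,
                                           int cursorOptions,
                                           ParamListInfo boundParams);

/* Mimics the global hook variable that the core planner consults. */
static planner_hook_type planner_hook = NULL;

/* Mimics the built-in planner; here it only returns an empty plan. */
static PlannedStmt *
standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
    static PlannedStmt result;

    (void) parse;
    (void) cursorOptions;
    (void) boundParams;
    result.planned_by_extension = 0;
    return &result;
}

/* Mimics planner(): use the hook if one is installed, else the default. */
static PlannedStmt *
planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
    if (planner_hook)
        return (*planner_hook) (parse, cursorOptions, boundParams);
    return standard_planner(parse, cursorOptions, boundParams);
}

/* ---- extension side ---- */
static planner_hook_type prev_planner_hook = NULL;
static int my_planner_calls = 0;

static PlannedStmt *
my_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
    PlannedStmt *stmt;

    assert(parse != NULL);
    my_planner_calls++;

    /* Chain to whoever was installed before us, else the standard planner. */
    if (prev_planner_hook)
        stmt = prev_planner_hook(parse, cursorOptions, boundParams);
    else
        stmt = standard_planner(parse, cursorOptions, boundParams);

    /* Hypothetical post-processing of the finished plan. */
    stmt->planned_by_extension = 1;
    return stmt;
}

/* Hypothetical init routine; a real module would do this in _PG_init(). */
static void
my_extension_init(void)
{
    prev_planner_hook = planner_hook;
    planner_hook = my_planner;
}
```

Remembering the previous hook value and calling it keeps other loaded
modules that use the same hook working. A hook that really replaces the
planner would build its own PlannedStmt instead of delegating.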

But that may be too complex a hook to implement on your own, so you can
look at more granular hooks, which let you intervene at certain points
within planning, such as:

/* Hook for plugins to get control in set_rel_pathlist() */
typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
                                            RelOptInfo *rel,
                                            Index rti,
                                            RangeTblEntry *rte);

/* Hook for plugins to get control in add_paths_to_joinrel() */
typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
                                             RelOptInfo *joinrel,
                                             RelOptInfo *outerrel,
                                             RelOptInfo *innerrel,
                                             JoinType jointype,
                                             JoinPathExtraData *extra);

/* Hook for plugins to replace standard_join_search() */
typedef RelOptInfo *(*join_search_hook_type) (PlannerInfo *root,
                                              int levels_needed,
                                              List *initial_rels);

/* Hook for plugins to get control in get_relation_info() */
typedef void (*get_relation_info_hook_type) (PlannerInfo *root,
                                             Oid relationObjectId,
                                             bool inhparent,
                                             RelOptInfo *rel);

On the executor side, you have:

/* Hook for plugins to get control in ExecutorStart() */
typedef void (*ExecutorStart_hook_type) (QueryDesc *queryDesc, int eflags);

/* Hook for plugins to get control in ExecutorRun() */
typedef void (*ExecutorRun_hook_type) (QueryDesc *queryDesc,
                                       ScanDirection direction,
                                       uint64 count,
                                       bool execute_once);

/* Hook for plugins to get control in ExecutorFinish() */
typedef void (*ExecutorFinish_hook_type) (QueryDesc *queryDesc);

/* Hook for plugins to get control in ExecutorEnd() */
typedef void (*ExecutorEnd_hook_type) (QueryDesc *queryDesc);

/* Hook for plugins to get control in ExecCheckRTPerms() */
typedef bool (*ExecutorCheckPerms_hook_type) (List *, bool);
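
As a rough illustration of how the executor hooks are often used in pairs,
the following standalone sketch wraps ExecutorStart and ExecutorEnd to
track how many queries are in flight. QueryDesc is a stand-in type, the
ExecutorStart()/ExecutorEnd() functions only mimic how the core code
dispatches to a hook, and the counters and my_executor_hooks_init() are
hypothetical names, so treat it as a sketch under those assumptions rather
than working extension code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in type (assumption): NOT the real Postgres QueryDesc. */
typedef struct QueryDesc { const char *sourceText; } QueryDesc;

typedef void (*ExecutorStart_hook_type) (QueryDesc *queryDesc, int eflags);
typedef void (*ExecutorEnd_hook_type) (QueryDesc *queryDesc);

/* Mimic the global hook variables checked by the executor entry points. */
static ExecutorStart_hook_type ExecutorStart_hook = NULL;
static ExecutorEnd_hook_type ExecutorEnd_hook = NULL;

/* Mimic the default implementations; they do nothing in this sketch. */
static void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    (void) queryDesc;
    (void) eflags;
}

static void
standard_ExecutorEnd(QueryDesc *queryDesc)
{
    (void) queryDesc;
}

/* Mimic the entry points: call the hook if set, else the default. */
static void
ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    if (ExecutorStart_hook)
        (*ExecutorStart_hook) (queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}

static void
ExecutorEnd(QueryDesc *queryDesc)
{
    if (ExecutorEnd_hook)
        (*ExecutorEnd_hook) (queryDesc);
    else
        standard_ExecutorEnd(queryDesc);
}

/* ---- extension side ---- */
static ExecutorStart_hook_type prev_ExecutorStart = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static int queries_started = 0;     /* total queries seen so far */
static int queries_in_flight = 0;   /* started but not yet ended */

static void
my_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    queries_started++;
    queries_in_flight++;
    if (prev_ExecutorStart)
        prev_ExecutorStart(queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}

static void
my_ExecutorEnd(QueryDesc *queryDesc)
{
    assert(queries_in_flight > 0);
    queries_in_flight--;
    if (prev_ExecutorEnd)
        prev_ExecutorEnd(queryDesc);
    else
        standard_ExecutorEnd(queryDesc);
}

/* Hypothetical init routine; a real module would do this in _PG_init(). */
static void
my_executor_hooks_init(void)
{
    prev_ExecutorStart = ExecutorStart_hook;
    ExecutorStart_hook = my_ExecutorStart;
    prev_ExecutorEnd = ExecutorEnd_hook;
    ExecutorEnd_hook = my_ExecutorEnd;
}
```

Because each wrapper still calls the previous hook or the standard
function, normal execution proceeds unchanged and the extension only
observes it.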

If you can be more specific about what exactly you're trying to do,
someone can give even better advice.

Thanks,
Amit
