> On Tue, Nov 25, 2014 at 3:44 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Today, I had a talk with Hanada-san to clarify which portion can be
> > common to them and how to implement it. We concluded that both
> > features can share most of the infrastructure.
> > Let me put an introduction of join replacement by foreign-/custom-scan
> > below.
> >
> > Its overall design intends to inject a foreign-/custom-scan node in
> > place of the built-in join logic (based on the estimated cost). From
> > the viewpoint of the core backend, it looks like a sub-query scan that
> > contains a join of relations internally.
> >
> > What we need to do is below:
> >
> > (1) Add a hook in add_paths_to_joinrel()
> > It gives extensions (including FDW drivers and custom-scan providers)
> > a chance to add alternative paths for a particular join of
> > relations, using ForeignScanPath or CustomScanPath, if they can run
> > instead of the built-in ones.
> >
> > (2) Inform the core backend of the varno/varattno mapping
> > One thing we need to pay attention to is that a foreign-/custom-scan
> > node that runs instead of the built-in join node must return a mixture
> > of values that come from both relations. When an FDW driver fetches a
> > remote record (or a record computed by an external computing
> > resource), the most reasonable way is to store it in ecxt_scantuple
> > of ExprContext, then kick off projection with varnodes that reference
> > this slot.
> > This needs infrastructure that tracks the relationship between the
> > original varnode and the alternative varno/varattno. We think it
> > should be mapped to INDEX_VAR and a virtual attribute number so that
> > it references ecxt_scantuple naturally; this infrastructure is quite
> > helpful for both ForeignScan and CustomScan.
> > We'd like to add a List *fdw_varmap / *custom_varmap field to both
> > plan nodes.
> > It contains a list of the original Var nodes, each mapped to the
> > position given by its list index (e.g., the first varnode maps to
> > varno=INDEX_VAR and varattno=1).
> >
> > (3) Reverse mapping on EXPLAIN
> > For EXPLAIN support, the above varnodes on the pseudo relation scan
> > need to be resolved. All we need to do is initialize dpns->inner_tlist
> > in set_deparse_planstate() according to the above mapping.
> >
> > (4) Case of scanrelid == 0
> > To skip opening/closing (foreign) tables, we need a mark that
> > instructs the backend to initialize the scan node not according to
> > the table definition, but according to the pseudo varnode list.
> > As the earlier custom-scan patch did, scanrelid == 0 is a
> > straightforward mark to show that the scan node is not associated
> > with a particular real relation.
> > So, we also need to add special-case handling around the
> > foreign-/custom-scan code.
> >
> > We expect the above changes are small enough to implement basic join
> > push-down functionality (that does not involve external computing of
> > complicated expression nodes), but valuable to support in v9.5.
> >
> > Please comment on the proposition above.
> I don't really have any technical comments on this design right at the moment,
> but I think it's an important area where PostgreSQL needs to make some
> progress sooner rather than later, so I hope that we can get something
> committed in time for 9.5.
I have implemented the interface portion; the patch is attached.
Hanada-san may be working on postgres_fdw support based on this interface
definition for the next commit fest.

The overall design of this patch is identical to what I described above.
It allows extensions (FDW drivers or custom-scan providers) to replace a
join with a foreign/custom-scan that internally holds the result set of an
externally computed join of relations. It looks like a relation scan on a
pseudo relation.

One thing we need to pay attention to is how setrefs.c fixes up
varno/varattno, which differs from a regular join structure. I found that
IndexOnlyScan already has similar infrastructure that redirects varnode
references to a certain column of ecxt_scantuple in the ExprContext, using
a pair of INDEX_VAR and an alternative varattno.

This patch adds a new field to each plan node: fdw_ps_tlist in ForeignScan
and custom_ps_tlist in CustomScan. It is the extension's role to set a
pseudo-scan target list (hence ps_tlist) on the foreign/custom-scan that
replaces a join.
If it is not NIL, set_plan_refs() takes a different strategy to fix up the
node. It calls fix_upper_expr() to map the varnodes in the expression lists
to INDEX_VAR according to the ps_tlist. The extension is expected to fill
ss_ScanTupleSlot of the scan state in advance with values/isnull pairs laid
out according to the ps_tlist.
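
To illustrate the remapping described above, here is a minimal standalone
sketch. It is not code from the patch: ToyVar, TOY_INDEX_VAR and
toy_fix_var() are hypothetical stand-ins for Var, INDEX_VAR and the lookup
that fix_upper_expr() performs, reduced to a plain array search.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's INDEX_VAR; the value is illustrative only. */
#define TOY_INDEX_VAR 65002

/* Simplified stand-in for a Var node: only the two fields that matter here. */
typedef struct ToyVar
{
    int varno;      /* range-table index of the source relation */
    int varattno;   /* attribute number within that relation */
} ToyVar;

/*
 * Search ps_tlist (an array of the original Vars, in output order) for 'var'.
 * On a match, rewrite 'var' so that it references the scan tuple: varno
 * becomes TOY_INDEX_VAR and varattno becomes the 1-based position in
 * ps_tlist. Returns 1 if the Var was remapped, 0 if it was not found.
 */
int
toy_fix_var(ToyVar *var, const ToyVar *ps_tlist, size_t ntlist)
{
    size_t i;

    assert(var != NULL);
    for (i = 0; i < ntlist; i++)
    {
        if (ps_tlist[i].varno == var->varno &&
            ps_tlist[i].varattno == var->varattno)
        {
            var->varno = TOY_INDEX_VAR;
            var->varattno = (int) (i + 1);
            return 1;
        }
    }
    return 0;
}
```

With a ps_tlist of {(1,3), (2,1)}, a reference to column 1 of relation 2
would be rewritten to (TOY_INDEX_VAR, 2), that is, the second column of the
scan tuple the extension fills in.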

Regarding the primary hook for adding alternative foreign/custom-scan
paths in place of the built-in join paths, I added the following hook:

/* Hook for plugins to get control in add_paths_to_joinrel() */
typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
                                             RelOptInfo *joinrel,
                                             RelOptInfo *outerrel,
                                             RelOptInfo *innerrel,
                                             List *restrictlist,
                                             JoinType jointype,
                                             SpecialJoinInfo *sjinfo,
                                             SemiAntiJoinFactors *semifactors,
                                             Relids param_source_rels,
                                             Relids extra_lateral_rels);
extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;

It should give extensions enough information to determine whether they
can offer alternative paths or not.
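
As a rough picture of how an extension would plug into such a hook, here is
a standalone toy model. Everything prefixed toy_ is hypothetical: the hook
signature is cut down to a single argument, ToyJoinRel only counts paths,
and calling the hook after the built-in paths is an assumption of this
sketch, not something the patch description states.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for RelOptInfo: just counts paths added to the join rel. */
typedef struct ToyJoinRel
{
    int npaths;
} ToyJoinRel;

/* Reduced hook signature; the hook in the patch takes many more arguments. */
typedef void (*toy_join_pathlist_hook_type) (ToyJoinRel *joinrel);

/* Global hook pointer; NULL while no extension is loaded. */
toy_join_pathlist_hook_type toy_join_pathlist_hook = NULL;

/* Previous hook value, saved by the hypothetical extension for chaining. */
toy_join_pathlist_hook_type toy_prev_hook = NULL;

/* The extension's hook: call any earlier hook first, then add one path. */
void
toy_extension_hook(ToyJoinRel *joinrel)
{
    if (toy_prev_hook)
        toy_prev_hook(joinrel);
    joinrel->npaths++;          /* pretend a foreign/custom path was added */
}

/* What an extension would do once at load time. */
void
toy_extension_init(void)
{
    toy_prev_hook = toy_join_pathlist_hook;
    toy_join_pathlist_hook = toy_extension_hook;
}

/* Core side: add the built-in paths, then give the hook a chance. */
void
toy_add_paths_to_joinrel(ToyJoinRel *joinrel)
{
    assert(joinrel != NULL);
    joinrel->npaths++;          /* pretend one built-in join path was added */
    if (toy_join_pathlist_hook)
        toy_join_pathlist_hook(joinrel);
}
```

Saving the previous hook value and calling it from the new hook is the
usual convention for PostgreSQL hooks, so that several extensions can
coexist.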

One thing I am concerned about is that the fdw_handler to call for a
joinrel is not obvious, unlike custom-scan, which holds a reference to
CustomScanMethods, because a joinrel is not managed by any FDW driver.
So I had to add an "Oid fdw_handler" field to RelOptInfo to track which
foreign tables are involved in the join. This field holds the OID of a
valid FDW handler if both the inner and outer relations are managed by the
same FDW handler; otherwise it is InvalidOid. Even if either or both of
them are themselves joins, fdw_handler is set as long as they are managed
by the same FDW handler. This allows a join to be replaced by a
foreign-scan that involves more than two tables.
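
The rule for fdw_handler on a joinrel can be sketched as follows. This is a
standalone illustration, not code from the patch: ToyOid, TOY_INVALID_OID,
ToyRel and toy_join_fdw_handler() are hypothetical stand-ins for Oid,
InvalidOid, RelOptInfo and whatever code sets the field.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int ToyOid;
#define TOY_INVALID_OID ((ToyOid) 0)

/* Toy stand-in for RelOptInfo: only the proposed fdw_handler field. */
typedef struct ToyRel
{
    ToyOid fdw_handler;     /* TOY_INVALID_OID if not managed by one FDW */
} ToyRel;

/*
 * Handler for a join of 'outer' and 'inner': the shared handler if both
 * sides are managed by the same valid FDW handler, otherwise invalid.
 * Either side may itself be a join whose field was computed the same way.
 */
ToyOid
toy_join_fdw_handler(const ToyRel *outer, const ToyRel *inner)
{
    assert(outer != NULL && inner != NULL);
    if (outer->fdw_handler != TOY_INVALID_OID &&
        outer->fdw_handler == inner->fdw_handler)
        return outer->fdw_handler;
    return TOY_INVALID_OID;
}
```

Because the result is stored on the joinrel itself, applying the same rule
at each join level carries a common handler up through joins of three or
more tables, and it drops to invalid as soon as a table from a different
FDW or a local table is joined in.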

One new interface contract is the case of scanrelid == 0. If a
foreign-/custom-scan is not associated with a particular relation,
ExecInitXXX() initializes ss_ScanTupleSlot according to the ps_tlist, and
no relation is opened.
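
A toy model of that contract, only to make the branching explicit.
ToyScanState and toy_exec_init_scan() are hypothetical names; the real
ExecInitXXX() routines build a tuple descriptor rather than just recording
a column count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the scan state; not PostgreSQL's ScanState. */
typedef struct ToyScanState
{
    int relation_opened;    /* 1 if the underlying table was opened */
    int slot_natts;         /* number of columns in the scan tuple slot */
} ToyScanState;

/*
 * Initialize the scan state. With scanrelid == 0 there is no real table
 * behind the node, so nothing is opened and the slot layout follows the
 * ps_tlist; otherwise the slot follows the table definition.
 */
void
toy_exec_init_scan(ToyScanState *state, unsigned int scanrelid,
                   int rel_natts, int ps_tlist_len)
{
    assert(state != NULL);
    if (scanrelid == 0)
    {
        state->relation_opened = 0;
        state->slot_natts = ps_tlist_len;
    }
    else
    {
        state->relation_opened = 1;
        state->slot_natts = rel_natts;
    }
}
```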

Because the working example is still under development, this patch has not
been tested or validated yet. However, it briefly implements the concept of
how we'd like to enhance the foreign-/custom-scan functionality.

NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

