Re: Foreign join pushdown vs EvalPlanQual

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, 花田茂 <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Foreign join pushdown vs EvalPlanQual
Date: 2015-08-25 01:18:30
Message-ID: 9A28C8860F777E439AA12E8AEA7694F801137623@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fujita-san,

How about your opinion towards the solution?
CF:Sep will start next week, so I'd like to make a consensus of
the direction, at least.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> Sent: Thursday, August 13, 2015 10:13 AM
> To: Etsuro Fujita; Robert Haas
> Cc: PostgreSQL-development; 花田茂
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
>
> Fujita-san,
>
> The attached patch enhanced the FDW interface according to the direction
> below (but not tested yet).
>
> >> In the summary, the following three enhancements are a straightforward
> >> way to fix up the problem he reported.
> >> 1. Add a special path to call recheckMtd in ExecScanFetch if scanrelid==0
> >> 2. Add a callback of FDW in ForeignRecheck() - to construct a record
> >> according to the fdw_scan_tlist definition and to evaluate its
> >> visibility, or to evaluate qualifier pushed-down if base relation.
> >> 3. Add List *fdw_paths in ForeignPath like custom_paths of CustomPaths,
> >> to construct plan nodes for EPQ evaluation.
>
> Likely, what you need to do are...
> 1. Save the alternative path on fdw_paths when foreign join push-down.
> GetForeignJoinPaths() may be called multiple times towards a particular
> joinrel according to the combination of innerrel/outerrel.
> RelOptInfo->fdw_private allows to avoid construction of same remote
> join path multiple times. On the second or later invocation, it may be
> a good tactics to reference cheapest_startup_path and replace the saved
> one if later invocation have cheaper one, prior to exit.
> 2. Save the alternative Plan nodes on fdw_plans or lefttree/righttree
> somewhere you like at the GetForeignPlan()
> 3. Makes BeginForeignScan() to call ExecInitNode() towards the plan node
> saved at (2), then save the PlanState on fdw_ps, lefttree/righttree,
> or somewhere private area if not displayed on EXPLAIN.
> 4. Implement ForeignRecheck() routine. If scanrelid==0, it kicks the
> planstate node saved at (3) to generate tuple slot. Then, call the
> ExecQual() to check qualifiers being pushed down.
> 5. Makes EndForeignScab() to call ExecEndNode() towards the PlanState
> saved at (3).
>
> I never think above steps are "too" complicated for people who can write
> FDW drivers. It is what developer usually does.
>
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>
>
> > -----Original Message-----
> > From: Kaigai Kouhei(海外 浩平)
> > Sent: Wednesday, August 12, 2015 11:17 PM
> > To: 'Etsuro Fujita'; Robert Haas
> > Cc: PostgreSQL-development; 花田茂
> > Subject: RE: [HACKERS] Foreign join pushdown vs EvalPlanQual
> >
> > > -----Original Message-----
> > > From: pgsql-hackers-owner(at)postgresql(dot)org
> > > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Etsuro Fujita
> > > Sent: Wednesday, August 12, 2015 8:26 PM
> > > To: Robert Haas; Kaigai Kouhei(海外 浩平)
> > > Cc: PostgreSQL-development; 花田茂
> > > Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> > >
> > > On 2015/08/12 7:21, Robert Haas wrote:
> > > > On Fri, Aug 7, 2015 at 3:37 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > > >>> I could have a discussion with Fujita-san about this topic.
> > > >>>
> > > >> Also, let me share with the discussion towards entire solution.
> > > >>
> > > >> The primitive reason of this problem is, Scan node with scanrelid==0
> > > >> represents a relation join that can involve multiple relations, thus,
> > > >> its TupleDesc of the records will not fit base relations, however,
> > > >> ExecScanFetch() was not updated when scanrelid==0 gets supported.
> > > >>
> > > >> FDW/CSP on behalf of the Scan node with scanrelid==0 are responsible
> > > >> to generate records according to the fdw_/custom_scan_tlist that
> > > >> reflects the definition of relation join, and only FDW/CSP know how
> > > >> to combine these base relations.
> > > >> In addition, host-side expressions (like Plan->qual) are initialized
> > > >> to reference the records generated by FDW/CSP, so the least invasive
> > > >> approach is to allow FDW/CSP to have own logic to recheck, I think.
> > > >>
> > > >> Below is the structure of ExecScanFetch().
> > > >>
> > > >> ExecScanFetch(ScanState *node,
> > > >> ExecScanAccessMtd accessMtd,
> > > >> ExecScanRecheckMtd recheckMtd)
> > > >> {
> > > >> EState *estate = node->ps.state;
> > > >>
> > > >> if (estate->es_epqTuple != NULL)
> > > >> {
> > > >> /*
> > > >> * We are inside an EvalPlanQual recheck. Return the test tuple
> > > if
> > > >> * one is available, after rechecking any
> access-method-specific
> > > >> * conditions.
> > > >> */
> > > >> Index scanrelid = ((Scan *) node->ps.plan)->scanrelid;
> > > >>
> > > >> Assert(scanrelid > 0);
> > > >> if (estate->es_epqTupleSet[scanrelid - 1])
> > > >> {
> > > >> TupleTableSlot *slot = node->ss_ScanTupleSlot;
> > > >> :
> > > >> return slot;
> > > >> }
> > > >> }
> > > >> return (*accessMtd) (node);
> > > >> }
> > > >>
> > > >> When we are inside of EPQ, it fetches a tuple in es_epqTuple[] array and
> > > >> checks its visibility (ForeignRecheck() always say 'yep, it is visible'),
> > > >> then ExecScan() applies its qualifiers by ExecQual().
> > > >> So, as long as FDW/CSP can return a record that satisfies the TupleDesc
> > > >> of this relation, made by the tuples in es_epqTuple[] array, rest of the
> > > >> code paths are common.
> > > >>
> > > >> I have an idea to solve the problem.
> > > >> It adds recheckMtd() call if scanrelid==0 just before the assertion above,
> > > >> and add a callback of FDW on ForeignRecheck().
> > > >> The role of this new callback is to set up the supplied TupleTableSlot
> > > >> and check its visibility, but does not define how to do this.
> > > >> It is arbitrarily by FDW driver, like invocation of alternative plan
> > > >> consists of only built-in logic.
> > > >>
> > > >> Invocation of alternative plan is one of the most feasible way to
> > > >> implement EPQ logic on FDW, so I think FDW also needs a mechanism
> > > >> that takes child path-nodes like custom_paths in CustomPath node.
> > > >> Once a valid path node is linked to this list, createplan.c transform
> > > >> them to relevant plan node, then FDW can initialize and invoke this
> > > >> plan node during execution, like ForeignRecheck().
> > > >>
> > > >> This design can solve another problem Fujita-san has also mentioned.
> > > >> If scan qualifier is pushed-down to the remote query and its expression
> > > >> node is saved in the private area of ForeignScan, the callback on
> > > >> ForeignRecheck() can evaluate the qualifier by itself. (Note that only
> > > >> FDW driver can know where and how expression node being pushed-down
> > > >> is saved in the private area.)
> > > >>
> > > >> In the summary, the following three enhancements are a straightforward
> > > >> way to fix up the problem he reported.
> > > >> 1. Add a special path to call recheckMtd in ExecScanFetch if scanrelid==0
> > > >> 2. Add a callback of FDW in ForeignRecheck() - to construct a record
> > > >> according to the fdw_scan_tlist definition and to evaluate its
> > > >> visibility, or to evaluate qualifier pushed-down if base relation.
> > > >> 3. Add List *fdw_paths in ForeignPath like custom_paths of CustomPaths,
> > > >> to construct plan nodes for EPQ evaluation.
> > >
> > > > I'm not an expert in this area, but this plan does not seem unreasonable
> to
> > > me.
> > >
> > > IIRC the discussion with KaiGai-san, I think that that would work. I
> > > think that that would be more suitable for CSPs, though. Correct me if
> > > I'm wrong, KaiGai-san. In either case, I'm not sure that the idea of
> > > transferring both processing to a single callback routine hooked in
> > > ForeignRecheck is a good idea: (a) check to see if the test tuple for
> > > each component foreign table satisfies the remote qual condition and (b)
> > > check to see if those tuples satisfy the remote join condition. I think
> > > that that would be too complicated, probably making the callback routine
> > > bug-prone. So, I'd still propose that *the core* processes (a) and (b)
> > > *separately*.
> > >
> > > * As for (a), the core checks the remote qual condition as in [1].
> > >
> > > * As for (b), the core executes an alternative subplan locally if inside
> > > an EPQ recheck. The subplan is created as described in [2].
> > >
> > I don't think it is "too" complicated because (a) visibility check of
> > the base tuples (saved in es_epqTuple[]) shall be done in the underlying
> > base foreign-scan node, executed as a part of alternative plan, and
> > (b) evaluation of remote qual is done with ExecQual() call.
> >
> > I seems to me your proposition tends to assume a particular design
> > towards FDW drivers, however, we already have various kind of FDW
> > drivers not only wrapper of remote RDBMS.
> > https://wiki.postgresql.org/wiki/Foreign_data_wrappers
> >
> > Is the [1] and [2] suitable for "all" of them, actually?
> >
> > Let's assume a FDW module that implements own columnar storage,
> > has a special JOIN capability if both side are its columnar storage.
> > Does it need alternative sub-plan for EPQ rechecks? Probably no,
> > because it has own capability to run JOIN by itself.
> > It is inconvenience for this FDW if core automatically kicks sub-
> > plan in spite of its own functionality/capability.
> >
> > If potential bugs are concerned, a common part shall be cut down
> > and provided as a utility function. FDW can determine whether it
> > shall be used, but never enforced.
> >
> > Thanks,
> > --
> > NEC Business Creation Division / PG-Strom Project
> > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2015-08-25 02:03:41 pg_controldata output alignment regression
Previous Message Kouhei Kaigai 2015-08-25 00:49:12 Re: [DESIGN] ParallelAppend