Re: Foreign join pushdown vs EvalPlanQual

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Foreign join pushdown vs EvalPlanQual
Date: 2015-11-27 06:33:17
Message-ID: 5657F92D.2010902@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015/11/27 0:14, Kouhei Kaigai wrote:
>> On 2015/11/26 14:04, Kouhei Kaigai wrote:
>>> The attached patch adds: Path *fdw_outerpath field to ForeignPath node.
>>> FDW driver can set arbitrary but one path-node here.
>>> After that, this path-node shall be transformed to plan-node by
>>> createplan.c, then passed to FDW driver using GetForeignPlan callback.

>> I understand this, as I also did the same thing in my patches, but
>> actually, that seems a bit complicated to me. Instead, could we keep
>> the fdw_outerpath in the fdw_private of a ForeignPath node when creating
>> the path node during GetForeignPaths, and then create an outerplan
>> accordingly from the fdw_outerpath stored into the fdw_private during
>> GetForeignPlan, by using create_plan_recurse there? I think that that
>> would make the core involvment much simpler.

> How to use create_plan_recurse by extension? It is a static function.

I was just thinking a change to make that function extern.

>>> We expect FDW driver set this plan-node on lefttree (a.k.a outerPlan).
>>> The Plan->outerPlan is a common field, so patch size become relatively
>>> small. FDW driver can initialize this plan at BeginForeignScan, then
>>> execute this sub-plan-tree on demand.

>> Another idea would be to add the core support for
>> initializing/closing/rescanning the outerplan tree when the tree is given.

> No. Please don't repeat same discussion again.

IIUC, I think your point is to allow FDWs to do something else, instead
of performing a local join execution plan, during RecheckForeignScan.
So, what's wrong with the core doing that support in that case?

>> @@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot
>> *slot)
>>
>> ResetExprContext(econtext);
>>
>> + /*
>> + * FDW driver has to recheck visibility of EPQ tuple towards
>> + * the scan qualifiers once it gets pushed down.
>> + * In addition, if this node represents a join sub-tree, not
>> + * a scan, FDW driver is also responsible to reconstruct
>> + * a joined tuple according to the primitive EPQ tuples.
>> + */
>> + if (fdwroutine->RecheckForeignScan)
>> + {
>> + if (!fdwroutine->RecheckForeignScan(node, slot))
>> + return false;
>> + }
>>
>> Maybe I'm missing something, but I think we should let FDW do the work
>> if scanrelid==0, not just if fdwroutine->RecheckForeignScan is given.
>> (And if scanrelid==0 and fdwroutine->RecheckForeignScan is not given, we
>> should abort the transaction.)

> It should be Assert(). The node with scanrelid==0 never happen
> unless FDW driver does not add such a path explicitly.

That's an idea. But the abort seems to me more consistent with other
places (see eg, RefetchForeignRow in EvalPlanQualFetchRowMarks).

>> Another thing I'm concerned about is
>>
>> @@ -347,8 +355,26 @@ ExecScanReScan(ScanState *node)
>> {
>> Index scanrelid = ((Scan *)
>> node->ps.plan)->scanrelid;
>>
>> - Assert(scanrelid > 0);
>> + if (scanrelid > 0)
>> + estate->es_epqScanDone[scanrelid - 1] = false;
>> + else
>> + {
>> + Bitmapset *relids;
>> + int rtindex = -1;
>> +
>> + if (IsA(node->ps.plan, ForeignScan))
>> + relids = ((ForeignScan *)
>> node->ps.plan)->fs_relids;
>> + else if (IsA(node->ps.plan, CustomScan))
>> + relids = ((CustomScan *)
>> node->ps.plan)->custom_relids;
>> + else
>> + elog(ERROR, "unexpected scan node: %d",
>> + (int)nodeTag(node->ps.plan));
>>
>> - estate->es_epqScanDone[scanrelid - 1] = false;
>> + while ((rtindex = bms_next_member(relids, rtindex)) >=
>> 0)
>> + {
>> + Assert(rtindex > 0);
>> + estate->es_epqScanDone[rtindex - 1] = false;
>> + }
>> + }
>> }
>>
>> That seems the outerplan's business to me, so I think it'd be better to
>> just return, right before the assertion, as I said before. Seen from
>> another angle, ISTM that FDWs that don't use a local join execution plan
>> wouldn't need to be aware of handling the es_epqScanDone flags. (Do you
>> think that such FDWs should do something like what ExecScanFtch is doing
>> about the flags, in their RecheckForeignScans? If so, I think we need
>> docs for that.)

> Execution of alternative local subplan (outerplan) is discretional.
> We have to pay attention FDW drivers which handles EPQ recheck by
> itself. Even though you argue callback can violate state of
> es_epqScanDone flags, it is safe to follow the existing behavior.

So, I think the documentation needs more work.

Yet another thing that I'm concerned about is

@@ -3747,7 +3754,8 @@ make_foreignscan(List *qptlist,
List *fdw_exprs,
List *fdw_private,
List *fdw_scan_tlist,
- List *fdw_recheck_quals)
+ List *fdw_recheck_quals,
+ Plan *fdw_outerplan)
{
ForeignScan *node = makeNode(ForeignScan);
Plan *plan = &node->scan.plan;
@@ -3755,7 +3763,7 @@ make_foreignscan(List *qptlist,
/* cost will be filled in by create_foreignscan_plan */
plan->targetlist = qptlist;
plan->qual = qpqual;
- plan->lefttree = NULL;
+ plan->lefttree = fdw_outerplan;
plan->righttree = NULL;
node->scan.scanrelid = scanrelid;

I think that that would break the EXPLAIN output. One option to avoid
that is to set the fdw_outerplan in ExecInitForeignScan as in my patch
[1], or BeginForeignScan as you proposed. That breaks the equivalence
that the Plan tree and the PlanState tree should be mirror images of
each other, but I think that that break would be harmless.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/55DEF5F0.308@lab.ntt.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-11-27 06:42:29 Re: Error with index on unlogged table
Previous Message Kouhei Kaigai 2015-11-27 06:25:34 Re: Foreign join pushdown vs EvalPlanQual