Re: Foreign join pushdown vs EvalPlanQual

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, 花田茂 <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Foreign join pushdown vs EvalPlanQual
Date: 2015-09-29 12:38:54
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80114C568@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Etsuro Fujita
> Sent: Tuesday, September 29, 2015 8:00 PM
> To: Kaigai Kouhei(海外 浩平); Robert Haas
> Cc: PostgreSQL-development; 花田茂
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
>
> On 2015/09/29 17:49, Kouhei Kaigai wrote:
> >> From: pgsql-hackers-owner(at)postgresql(dot)org
> >> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Etsuro Fujita
>
> >>> RefetchForeignRow() does not take a ForeignScanState as an argument,
> >>> so it is not obvious how to access its private field, is it?
> >>> ExecRowMark contains an "rti" field, so it might be feasible to find
> >>> the target PlanState using the recently added walker routine, although
> >>> that is not simple enough.
> >>> Unless we have a reference to the private field, it is not feasible
> >>> to access the expression that was pushed down to the remote side, and
> >>> therefore we cannot apply the proper rechecks here.
>
> >>> Could you tell us (1) how to access the private data field of
> >>> ForeignScanState from the RefetchForeignRow callback?
>
> >> For the foreign table case (scanrelid>0), I imagined an approach
> >> different than yours. In that case, I thought the issue would be
> >> probably addressed by just modifying the remote query performed in
> >> RefetchForeignRow, which would be of the form "SELECT ctid, * FROM
> >> remote table WHERE ctid = $1", so that the modified query would be of
> >> the form "SELECT ctid, * FROM remote table WHERE ctid = $1 AND *remote
> >> quals*".
>
> Sorry, I forgot to add "FOR UPDATE" to the before/after queries.
>
> > My question is how to pull out the expression of the remote query.
> > It would be stored somewhere in a private field of ForeignScanState;
> > however, RefetchForeignRow does not have direct access to the
> > relevant ForeignScanState node.
> > That is what I asked in question (1).
>
> I imagined the following steps to get the remote query string: (1)
> create the remote query string and store it in fdw_private during
> postgresGetForeignPlan, (2) extract the string from fdw_private and
> store it in erm->ermExtra during postgresBeginForeignScan, and (3)
> extract the string from erm->ermExtra in postgresRefetchForeignRow.
>
> > Also note that EvalPlanQualFetchRowMarks() will raise an error
> > if the RefetchForeignRow callback returns a NULL tuple.
> > Is that the right or expected behavior?
>
> IIUC, I think that that behavior is reasonable.
>
> > It looks to me like this callback is designed to pull out a particular
> > tuple identified by the remote row-id, regardless of any qualifier
> > checks against the latest value.
>
> Because erm->markType==ROW_MARK_REFERENCE, I don't think that that
> behavior would cause any problem. Maybe I'm missing something, though.
>
Really?

ExecLockRows() calls EvalPlanQualFetchRowMarks() to fill the EPQ tuple
slots before it calls EvalPlanQualNext(), because those tuples are
referenced during the EPQ recheck.
The purpose of EvalPlanQualNext() is to evaluate whether the current set
of rows still satisfies the qualifiers of the underlying scan/join.
If it does not, ExecLockRows() *ignores* the current tuples, as follows.

    /*
     * Now fetch any non-locked source rows --- the EPQ logic knows how to
     * do that.
     */
    EvalPlanQualSetSlot(&node->lr_epqstate, slot);
    EvalPlanQualFetchRowMarks(&node->lr_epqstate);  /* <--- LOAD REMOTE ROWS */

    /*
     * And finally we can re-evaluate the tuple.
     */
    slot = EvalPlanQualNext(&node->lr_epqstate);    /* <--- EVALUATE QUALIFIERS */
    if (TupIsNull(slot))
    {
        /* Updated tuple fails qual, so ignore it and go on */
        goto lnext;     /* <--- IGNORE THE ROW, DO NOT RAISE AN ERROR */
    }

What happens if RefetchForeignRow raises an error when the latest row
exists but no longer satisfies the "remote quals"?
That case should be ignored, unlike the case where the remote row
identified by the row-id no longer exists.
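
To make the distinction concrete, here is a minimal, self-contained sketch
in plain C. It is only a toy model, not PostgreSQL code: RemoteRow,
RefetchOutcome, RecheckAction, remote_qual(), fetch_by_id_with_quals(),
classify_refetch(), action_for() and refetch_demo() are all hypothetical
names invented for illustration, and the qual "value > 10" is an arbitrary
stand-in for the remote quals. It shows that a refetch of the form
"WHERE ctid = $1 AND <remote quals>" returns no row in both cases, so the
caller cannot tell "row is gone" (error) from "row fails the quals" (skip).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of re-fetching one remote row during an EPQ recheck. */
typedef enum
{
    REFETCH_ROW_OK,             /* row exists and satisfies the remote quals */
    REFETCH_ROW_FAILS_QUALS,    /* row exists but no longer satisfies them */
    REFETCH_ROW_MISSING         /* no row with that row-id exists any more */
} RefetchOutcome;

/* Hypothetical action the caller takes for each outcome. */
typedef enum
{
    ACTION_RETURN_ROW,
    ACTION_SKIP_ROW,            /* corresponds to "goto lnext" */
    ACTION_RAISE_ERROR
} RecheckAction;

/* Toy stand-in for the latest version of a remote row. */
typedef struct
{
    bool exists;
    int  value;
} RemoteRow;

/* Toy remote qual (an assumption for this sketch): value > 10. */
bool remote_qual(const RemoteRow *row)
{
    return row->value > 10;
}

/*
 * Models "SELECT ... WHERE ctid = $1 AND <remote quals>".
 * Returns true only if the query would return a row.
 */
bool fetch_by_id_with_quals(const RemoteRow *row)
{
    return row->exists && remote_qual(row);
}

/*
 * Models "SELECT ... WHERE ctid = $1" followed by a separate qual check,
 * which keeps the two failure cases apart.
 */
RefetchOutcome classify_refetch(const RemoteRow *row)
{
    if (!row->exists)
        return REFETCH_ROW_MISSING;
    if (!remote_qual(row))
        return REFETCH_ROW_FAILS_QUALS;
    return REFETCH_ROW_OK;
}

/* Maps each outcome to the behavior argued for in this mail. */
RecheckAction action_for(RefetchOutcome outcome)
{
    switch (outcome)
    {
        case REFETCH_ROW_OK:
            return ACTION_RETURN_ROW;
        case REFETCH_ROW_FAILS_QUALS:
            return ACTION_SKIP_ROW;
        case REFETCH_ROW_MISSING:
            return ACTION_RAISE_ERROR;
    }
    return ACTION_RAISE_ERROR;  /* not reached for valid enum values */
}

/* Small self-check of the behavior described above. */
void refetch_demo(void)
{
    RemoteRow gone = { false, 0 };
    RemoteRow updated = { true, 5 };    /* exists, but fails value > 10 */
    RemoteRow unchanged = { true, 20 };

    /* The combined query returns no row in both failure cases. */
    assert(!fetch_by_id_with_quals(&gone));
    assert(!fetch_by_id_with_quals(&updated));
    assert(fetch_by_id_with_quals(&unchanged));

    /* Fetching first and checking quals separately keeps them distinct. */
    assert(action_for(classify_refetch(&gone)) == ACTION_RAISE_ERROR);
    assert(action_for(classify_refetch(&updated)) == ACTION_SKIP_ROW);
    assert(action_for(classify_refetch(&unchanged)) == ACTION_RETURN_ROW);
}
```

In this model, calling refetch_demo() passes all of its assertions. If the
callback only reports "no row", the skip case collapses into the error
case, which is the concern raised above.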

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
