Re: Foreign join pushdown vs EvalPlanQual

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Foreign join pushdown vs EvalPlanQual
Date: 2015-11-20 13:45:17
Message-ID: 9A28C8860F777E439AA12E8AEA7694F801174091@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> On 2015/11/19 12:32, Robert Haas wrote:
> > On Tue, Nov 17, 2015 at 8:47 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> >> The attached patch is the portion cut from the previous EPQ recheck
> >> patch.
>
> > Thanks, committed.
>
> Thanks, Robert and KaiGai-san.
>
> Sorry, I'm a bit late to the party. Here are my questions:
>
> * This patch means we can define fdw_recheck_quals even for the case of
> foreign tables with non-NIL fdw_scan_tlist. However, we discussed in
> another thread [1] that such foreign tables might break EvalPlanQual
> tests. Where are we on that issue?
>
In the case of late locking, RefetchForeignRow() will set a base tuple
that has a layout compatible with the base relation, not with
fdw_scan_tlist, because RefetchForeignRow() has no information about
the scan node. There are two solutions: 1) do not use fdw_scan_tlist
in an FDW that uses the late locking mechanism, or 2) have the recheck
callback apply a projection to fit fdw_scan_tlist (it would not be
difficult for the core to provide that as a utility function).

Even though we allow fdw_scan_tlist to be set up in simple scan cases,
that does not mean it works in every case.
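
To illustrate option 2 above, here is a minimal standalone sketch of
what "apply a projection to fit fdw_scan_tlist" could look like. It is
only a model: BaseRow, project_to_scan_tlist and scan_attnos are
hypothetical names for illustration, not part of the PostgreSQL FDW
API, and real code would operate on TupleTableSlots and the scan
node's target list rather than on string arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tuple laid out like the base relation. */
typedef struct BaseRow
{
    int                 ncols;   /* number of base-relation columns */
    const char *const  *values;  /* one value per base-relation column */
} BaseRow;

/*
 * Project a base-relation-shaped row into the layout described by a
 * scan target list.  scan_attnos[i] is the 1-based base column number
 * that feeds output column i (an assumption of this sketch: the target
 * list contains plain column references only).
 *
 * Returns 0 on success, -1 if an entry refers to a column that does
 * not exist in the base row.
 */
static int
project_to_scan_tlist(const BaseRow *row, const int *scan_attnos,
                      int ntlist, const char **out)
{
    assert(row != NULL && scan_attnos != NULL && out != NULL);

    for (int i = 0; i < ntlist; i++)
    {
        int attno = scan_attnos[i];

        if (attno < 1 || attno > row->ncols)
            return -1;
        out[i] = row->values[attno - 1];
    }
    return 0;
}
```

For a base row (id, name, city) and scan_attnos = {3, 1}, the output
row is (city, id). The point is only that the recheck side has to
re-shape the refetched tuple before quals written against the
fdw_scan_tlist layout can be evaluated on it.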

> * For the case of foreign joins, I think fdw_recheck_quals can be
> defined for example, the same way as for the case of foreign tables, ie,
> quals not in scan.plan.qual, or ones defined as "otherclauses"
> (rinfo->is_pushed_down=true) pushed down to the remote. But since it's
> required that the FDW has to add to the fdw_scan_tlist the set of
> columns needed to check quals in fdw_recheck_quals in preparation for
> EvalPlanQual tests, it's likely that fdw_scan_tlist will end up being
> long, leading to an increase in a total data transfer amount from the
> remote. So, that seems not practical to me. Maybe I'm missing
> something, but what use cases are you thinking?
>
It is a trade-off. What solution do you think we can have?
To avoid transferring data that is used only for the EPQ recheck, we
could implement the FDW driver to issue the remote join again on EPQ
recheck; however, that is not a wise design, is it?

If we could have neither extra data transfer nor remote join
execution during EPQ recheck, that would be perfect.
However, we have to weigh both the advantages and the disadvantages
when we decide on an implementation. We usually choose the way that
has more advantages than disadvantages, but that does not mean it has
no disadvantages.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
