Re: Foreign join pushdown vs EvalPlanQual

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: Foreign join pushdown vs EvalPlanQual
Date: 2015-10-22 08:10:44
Message-ID: 56289A04.9080007@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015/10/20 9:36, Kouhei Kaigai wrote:
> Even if we fetch whole-row of both side, join pushdown is exactly working
> because we can receive less number of rows than local join + 2 of foreign-
> scan. (If planner works well, we can expect join-path that increases number
> of rows shall be dropped.)
>
> One downside of my proposition is growth of width for individual rows.
> It is a trade-off situation. The above approach takes no changes for
> existing EPQ infrastructure, thus, its implementation design is clear.
> On the other hands, your approach will reduce traffic over the network,
> however, it is still unclear how we integrate scanrelid==0 with EPQ
> infrastructure.

I agree with KaiGai-san that his proposition (or my proposition based on
secondary plans) is still a performance improvement over the current
implementation on local joining plus early row locking, since that that
wouldn't have to transfer useless data that didn't satisfy join
conditions at all!

> On the other hands, in case of custom-scan that takes underlying local
> scan-nodes, thus, any kind of ROW_MARK_* except for ROW_MARK_COPY will
> happen. I think width of the joined tuples are relatively minor issue
> than FDW cases. However, we cannot expect the fetched rows are protected
> by early row-locking mechanism, so probability of re-fetching rows and
> reconstruction of joined-tuple has relatively higher priority.

I see.

>> There is also some possible loss of efficiency with this approach.
>> Suppose that we have two tables ft1 and ft2 which are being joined,
>> and we push down the join. They are being joined on an integer
>> column, and the join needs to select several other columns as well.
>> However, ft1 and ft2 are very wide tables that also contain some text
>> columns. The query is like this:
>>
>> SELECT localtab.a, ft1.p, ft2.p FROM localtab LEFT JOIN (ft1 JOIN ft2
>> ON ft1.x = ft2.x AND ft1.huge ~ 'stuff' AND f2.huge2 ~ 'nonsense') ON
>> localtab.q = ft1.q;
>>
>> If we refetch each row individually, we will need a wholerow image of
>> ft1 and ft2 that includes all columns, or at least f1.huge and
>> f2.huge2. If we just fetch a wholerow image of the join output, we
>> can exclude those. The only thing we need to recheck is that it's
>> still the case that localtab.q = ft1.q (because the value of
>> localtab.q might have changed).

As KaiGai-san mentioned above, what we need to discuss more about with
Robert's proposition is how to integrate that into the existing EPQ
machinery. For example, when, where, and how should we refetch the
whole-row image of the join output in the case of late row locking? IMV
I think that that would need to add a new FDW API different from
RefetchForeignRow, say RefetchForeignJoinRow.

IMO I think that another benefit from the proposition from KaiGai-san
(or me) would be that that could provide the whole functionality for row
locking in remote joins, without an additional development burden on an
FDW author; the author only has to write GetForeignRowMarkType and
RefetchForeignRow, which I think is relatively easy. I think that in
the proposition, the use of rowmark types such as ROW_MARK_SHARE or
ROW_MARK_EXCLUSIVE for foreign tables in remote joins would be quite
inefficient, but I think that the use of ROW_MARK_REFERENCE instead of
ROW_MARK_COPY would be an option for the workload where EPQ rechecks are
rarely invoked, because we just need to transfer ctids, not whole-row
images.

Best regards,
Etsuro Fujita

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-10-22 09:00:26 Re: [PATCH v3] GSSAPI encryption support
Previous Message Michael Paquier 2015-10-22 07:47:09 Re: [PATCH v3] GSSAPI encryption support