Re: Push down more full joins in postgres_fdw

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Push down more full joins in postgres_fdw
Date: 2017-01-30 12:05:18
Message-ID: CAFjFpRdQvx86-3BPvkzEGK=7nC0_k6Nv0XzQsO8fjBcTjWuQdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 30, 2017 at 5:00 PM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2017/01/27 21:25, Etsuro Fujita wrote:
>>
>> On 2017/01/27 20:04, Ashutosh Bapat wrote:
>>>
>>> I think we should pick up your patch on
>>> 27th December, update the comment per your mail on 5th Jan. I will
>>> review it once and list down the things left to committer's judgement.
>
>
>> Sorry, I started thinking we went in the wrong direction. I added to
>> deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for
>> each subquery present in a given join tree prior to deparsing a whole
>> remote query. But that's nothing but an overhead; we need to create a
>> tlist for the top-level query because we use it as fdw_scan_tlist, but
>> not for subqueries, and we need to create retrieved_attrs for the
>> top-level query while deparsing the targetlist in
>> deparseExplicitTargetList, but not for subqueries. So, we should need
>> some work to avoid such a useless overhead.
>
>
> I think we can avoid the useless overhead by adding a new function,
> deparseSubqueryTargetList, that deparses expressions in the given relation's
> reltarget, not the tlist, as a SELECT clause of the subquery representing
> the given relation. That would also allow us to make the 1-to-1
> relationship between the subquery output columns and their aliases more
> explicit, which was your original comment. Please find attached the new
> version. (The patch doesn't need the patch to avoid duplicate construction
> of the tlist, discussed upthread.)

I have not looked at the patch, but the reason we use a tlist instead
of list of expressions is because fdw_scan_tlist is expected in that
form and we don't want two different representations one to deparse
SELECT and one to interpret results from the foreign server. What you
describe above seems to introduce exactly that hazard.

>
> Other changes:
> * I went back to make_outerrel_subquery and make_innerrel_subquery, which
> are flags to indicate whether to deparse the input relations as subqueries.
> is_subquery_rel would work well for handling the cases of full joins with
> restrictions on the input relations, but I noticed that that wouldn't work
> well when extending to handle the cases where we deparse the input relations
> as subqueries to evaluate PHVs remotely.

I had objected to using a single variable instead of two previously
and you had argued against it in [1]. There you had mentioned that PHV
doesn't need two variables, but now you are taking the other stand,
without any apparent reason. Can you please clarify it?

[1] https://www.postgresql.org/message-id/1eb58ee4-8ffa-7c40-1229-c8973e6498ea@lab.ntt.co.jp

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-01-30 12:45:07 Re: Supporting huge pages on Windows
Previous Message Heikki Linnakangas 2017-01-30 12:04:18 Re: Subtle bug in "Simplify tape block format" commit