Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
Date: 2016-04-14 06:20:45
Message-ID: CAFjFpReoTXaAGU5J5eo0tY46v+-MnDem3HhrXM3iLvWG_mYPNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 14, 2016 at 8:42 AM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> On 2016/03/29 23:20, Ashutosh Bapat wrote:
>
>> I think the reason for that is in foreign_join_ok. This in that
>> function:
>>
>> wrongly pulls up remote_conds from joining relations in the FULL
>> JOIN case. I think we should not pull up such conditions in the
>> FULL JOIN case.
>>
>
> Right. For a full outer join, since each joining relation acts as outer
>> for the other, we can not pull up the quals to either join clauses or
>> other clauses. So, in such a case, we will need to encapsulate the
>> joining relation with conditions into a subquery. Unfortunately, the
>> current deparse logic does not handle this encapsulation. Adding that
>> functionality so close to the feature freeze might be risky given the
>> amount of code changes required.
>>
>> PFA patch with a quick fix. A full outer join with either of the joining
>> relations having WHERE conditions (or other clauses) is not pushed down.
>> In the particular case that was reported, the bug triggered because of
>> the way conditions are handled for an inner join. For an inner join, all
>> the conditions in ON as well as WHERE clause are treated like they are
>> part of WHERE clause. This allows pushing down a join even if there are
>> unpushable join clauses. But the pushable conditions can be put back
>> into the ON clause. This avoids using subqueries while deparsing.
>>
>
> I'm not sure that is a good idea. As you mentioned, we could support FULL
> JOIN fully, by encapsulating a joining relation with conditions into a
> subquery. And ISTM that it is relatively easy to do that, by borrowing
> some ideas from Hanada-san's original join pushdown patch. If it's OK,
> I'll create a patch for that in a few days.
>

In his patch the deparsing targetlist and conditions required that the FROM
clause entries were ready with the columns from base relations and joins
realiased. That's no more true. We deparse every Var node as <relation
alias>.<column name> where relation alias is nothing but rN; N being index
of RangeTblEntry. So, Hanada-san's method to deparse recursively can not be
applied as such now.

Here's what needs to be done:
When we identify that certain relation (base or join) needs a subquery to
be deparsed (because the join relation above it could not pull the quals
up), we remember it in the upper join relation. Before deparsing 1. we walk
the join tree and collect targetlists of all such relations, 2. associate
column aliases with those targetlists (save the column alises in resname?)
and craft a relation alias 3. associate the relations alias, column
aliases and targetlists with the base relations involved in such relations
(may be creating a list similar to rtable). While deparsing a Var node, we
check if corresponding base relation is itself or part of a relation
deparsed as a subquery. If it is then we lookup that Var in the targetlist
associated with the base relation and use corresponding relation and column
alias for deparsing it. Otherwise, deparse it as <relation alias>.<column
name> usually.

That looks like a big code change to go after feature freeze. So, we will
leave it for 9.7, unless RMT allows us to introduce that change.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-04-14 06:31:21 Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
Previous Message Jeff Janes 2016-04-14 05:58:53 Re: Detrimental performance impact of ringbuffers on performance