Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Date: 2016-02-02 16:32:13
Message-ID: CA+TgmoZzqmvZ0XwD=JFmmKQqr-YYRXmcM-uhurH5Lr2r60+g3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 2, 2016 at 11:21 AM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> Why does deparseSelectStmtForRel change the order of the existing
>> arguments? I have no issue with adding new arguments as required, but
>> rearranging the existing argument order doesn't serve any useful
>> purpose that is immediately apparent.
>
> deparseSelectStmtForRel has two sets of arguments, input and output. They
> are separated in the declaration all inputs come first, followed by all
> outputs. The inputs were ordered according to their appearance in SELECT
> statement, so I added tlist before remote_conds. I should have added
> relations, which is an output argument, at the end, but I accidentally added
> it between existing output arguments. Anyway, I will go ahead and just add
> the new arguments after the existing ones.

No, that's not what I'm asking for, nor do I think it's right. What
I'm complaining about is that originally params_list was after
retrieved_attrs, but in v5 it's before retrieved_attrs. I'm fine with
inserting tlist after rel, or in general inserting new arguments in
the sequence. But you reversed the relative ordering of params_list
and retrieved_attrs.

> I was thinking on the similar lines except rN aliases. I think there will be
> problem for queries like
> postgres=# explain verbose select * from lt left join (select bar.a, foo.b
> from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on
> (lt.b = q.b);
> QUERY PLAN
> --------------------------------------------------------------------------------
> Hash Right Join (cost=318.03..872.45 rows=43 width=16)
> Output: lt.a, lt.b, bar.a, foo.b
> Hash Cond: (foo.b = lt.b)
> -> Merge Join (cost=317.01..839.07 rows=8513 width=8)
> Output: bar.a, foo.b
> Merge Cond: (bar.a = foo.a)
> Join Filter: ((bar.b + foo.b) < 10)
> -> Sort (cost=158.51..164.16 rows=2260 width=8)
> Output: bar.a, bar.b
> Sort Key: bar.a
> -> Seq Scan on public.bar (cost=0.00..32.60 rows=2260
> width=8)
> Output: bar.a, bar.b
> -> Sort (cost=158.51..164.16 rows=2260 width=8)
> Output: foo.b, foo.a
> Sort Key: foo.a
> -> Seq Scan on public.foo (cost=0.00..32.60 rows=2260
> width=8)
> Output: foo.b, foo.a
> -> Hash (cost=1.01..1.01 rows=1 width=8)
> Output: lt.a, lt.b
> -> Seq Scan on public.lt (cost=0.00..1.01 rows=1 width=8)
> Output: lt.a, lt.b
> (21 rows)
>
> The subquery q is pulled up, so there won't be trace of q in the join tree
> except may be a useless RTE for the subquery. There will be RelOptInfo
> representing join between lt, bar and foo and a RelOptInfo for join between
> bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before
> joining (bar, foo) with lt and should go with bar left join foo. But the
> syntax doesn't support something like "bar left join foo on (bar.a = foo.a)
> where bar.b + foo.b". So we will have to construct a SELECT statement for
> this join and add to the FROM clause with a subquery alias and then refer
> the columns of foo and bar with that subquery alias.

Hmm, does it work if we put bar.b + foo.b < 10 in the ON clause for
the join between lt and foo/bar? I think so...

> Further during the process of qual placement, quals that can be evaluated at
> the level of given relation in the join tree are attached to that relation
> if they can be pushed down. Thus if we see a qual attached to a given
> relation, AFAIU, we can not say whether it needs to be evaluated there
> (similar to above query) or planner pushed it down for optimization, and
> thus for every join relation with quals we will need to build subqueries
> with aliases.

I don't think that's true. I theorize that every qual can either go
into the top level WHERE clause or the ON clause of some join.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2016-02-02 16:34:28 Re: PostgreSQL Auditing
Previous Message Joshua D. Drake 2016-02-02 16:28:54 Re: PostgreSQL Auditing