Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thom(at)linux(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Date: 2015-12-08 08:27:52
Message-ID: CAFjFpRfMFRiaCwXypW7eRL+H1W7peFEij=Fb5aDBwDeAx7cLsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 3, 2015 at 12:36 PM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> Hi Ashutosh,
>
> On 2015/12/02 20:45, Ashutosh Bapat wrote:
>
>> It's been a long time since last patch on this thread was posted. I have
>> started
>> to work on supporting join pushdown for postgres_fdw.
>>
>
> Thanks for the work!
>
> Generating paths
>> ================
>>
>
> A join between two foreign relations is considered safe to push down if
>> 1. The joining sides are pushable
>> 2. The type of join is OUTER or INNER (LEFT/RIGHT/FULL/INNER). SEMI and
>> ANTI
>> joins are not considered right now, because of difficulties in
>> constructing
>> the queries involving those. The join clauses of SEMI/ANTI joins are
>> not in a
>> form that can be readily converted to IN/EXISTS/NOT EXIST kind of
>> expression.
>> We might consider this as future optimization.
>> 3. Joining sides do not have clauses which can not be pushed down to the
>> foreign
>> server. For an OUTER join this is important since those clauses need
>> to be
>> applied before performing the join and thus join can not be pushed
>> to the
>> foreign server. An example is
>> SELECT * FROM ft1 LEFT JOIN (SELECT * FROM ft2 where local_cond) ft2
>> ON (join clause)
>> Here the local_cond on ft2 needs to be executed before performing
>> LEFT JOIN
>> between ft1 and ft2.
>> This condition can be relaxed for an INNER join by pulling the local
>> clauses
>> up the join tree. But this needs more investigation and is not
>> considered in
>> this version.
>> 4. The join conditions (e.g. conditions in ON clause) are all safe to
>> push down.
>> This is important for OUTER joins as pushing down join clauses
>> partially and
>> applying rest locally changes the result. There are ways [1] by
>> which partial
>> OUTER join can be completed by applying unpushable clauses locally
>> and then
>> nullifying the nullable side and eliminating duplicate non-nullable
>> side
>> rows. But that's again out of scope of first version of postgres_fdw
>> join
>> pushdown.
>>
>
> As for 4, as commented in the patch, we could relax the requirement that
> all the join conditions (given by JoinPathExtraData's restrictlist) need to
> be safe to push down to the remote server;
> * In case of inner join, all the conditions would not need to be safe.
> * In case of outer join, all the "otherclauses" would not need to be safe,
> while I think all the "joinclauses" need to be safe to get the right
> results (where "joinclauses" and "otherclauses" are defined by
> extract_actual_join_clauses). And I think we should do this relaxation to
> some extent for 9.6, to allow more joins to be pushed down.

agreed. I will work on those.

> I don't know about [1]. May I see more information about [1]?
>
>
Generating plan
>> ===============
>>
>
> Rest of this section describes the logic to construct
>> the SQL
>> for join; the logic is implemented as function deparseSelectSqlForRel().
>>
>> deparseSelectSqlForRel() builds the SQL for given joinrel (and now for
>> baserel
>> asd well) recursively.
>> For joinrels
>> 1. it constructs SQL representing either side of join, by calling itself
>> in recursive fashion.
>> 2. These SQLs are converted into subqueries and become part of the FROM
>> clause
>> with appropriate JOIN type and clauses. The left and right
>> subqueries are
>> given aliases "l" and "r" respectively. The columns in each subquery
>> are
>> aliased as "a1", "a2", "a3" and so on. Thus the third column on left
>> side can
>> be referenced as "l.a3" at any recursion level.
>> 3. Targetlist is added representing the columns in the join result
>> expected at
>> that level.
>> 4. The join clauses are added as part of ON clause
>> 5. Any clauses that planner has deemed fit to be evaluated at that level
>> of join
>> are added as part of WHERE clause.
>>
>
> Honestly, I'm not sure that that is a good idea. One reason for that is
> that a query string constructed by the procedure is difficult to read
> especially when the procedure is applied recursively. So, I'm thinking to
> revise the procedure so as to construct a query string with a flattened
> FROM clause, as discussed in eg, [2].
>
>
Just to confirm, the hook discussed in [2] is not in place right? I can
find only one hook for foreign join
50 typedef void (*GetForeignJoinPaths_function) (PlannerInfo *root,
51 RelOptInfo
*joinrel,
52 RelOptInfo
*outerrel,
53 RelOptInfo
*innerrel,
54 JoinType
jointype,
55 JoinPathExtraData
*extra);
This hook takes an inner and outer relation, so can not be used for N-way
join as discussed in that thread.

Are you suggesting that we should add that hook before we implement join
pushdown in postgres_fdw? Am I missing something?

> TODOs
>> =====
>> This patch is very much WIP patch to show case the approach and invite
>> early
>> comments. I will continue to improve the patch and some of the areas
>> that will
>> be improved are
>> 1. Costing of foreign join paths.
>> 2. Various TODOs in the patch, making it more readable, finishing etc.
>> 3. Tests
>> 4. Any comments/suggestions on approach or the attached patch.
>>
>
> That would be great!
>
> In another thread Robert, Fujita-san and Kaigai-san are discussing about
>> EvalPlanQual support for foreign joins. Corresponding changes to
>> postgres_fdw
>> will need to be added once those changes get committed.
>>
>
> Yeah, we would need those changes including helper functions to create a
> local join execution plan for that support. I'd like to add those changes
> to your updated patch if it's okay.
>
>
Right now, we do not have any support for postgres_fdw join pushdown. I was
thinking of adding at least minimal support for the same using this patch,
may be by preventing join pushdown in case there are row marks for now.
That way, we at least have some way to play with postgres_fdw join
pushdown. Once we have that, we can work on remaining items listed for 9.6
and also you can add suport for row marks with fix for EvalPlanQual
independently. This will keep the first patch smaller. Do you agree or you
want to see EvalPlanQual fix to be in the first patch itself?

> Best regards,
> Etsuro Fujita
>
> [2]
> http://www.postgresql.org/message-id/CA+TgmoZH9PB8BC+Z3rE7wo8CwuxAF7VP3066iSG39QfR1jJ+UQ@mail.gmail.com
>
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2015-12-08 08:56:19 Re: Some bugs in psql_complete of psql
Previous Message Michael Paquier 2015-12-08 07:40:11 Re: