Re: postgres_fdw bug in 9.6

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw bug in 9.6
Date: 2016-12-21 16:04:32
Message-ID: CAFjFpRd9oZEkur9aOrMh2Toxq4NYuUSw2kB9S+O=uvZ4xcR=Sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Some review comments

1. postgres_fdw doesn't push down semi and anti joins so you may want to
discount these two too.
+ jointype == JOIN_SEMI ||
+ jointype == JOIN_ANTI);

2. We should try to look for other not-so-cheap paths if the cheapest one is
paramterized. You might want to use get_cheapest_path_for_pathkeys() to find a
suitable unparameterized path by passing NULL for required_outer and NIL for
pathkeys, that's a very strange usage, but I think it will serve the purpose.
On the thread we discussed that we should save the epq_path create for lower
join and use it here. That may be better than searching for a path.
+ /* Give up if the cheapest-total-cost paths are parameterized. */
+ if (!bms_is_empty(PATH_REQ_OUTER(outer_path)) ||
+ !bms_is_empty(PATH_REQ_OUTER(inner_path)))
+ return NULL;

3. Adding new members to JoinPathExtraData may save some work for postgres_fdw
and other FDWs which would use CreateLocalJoinPath(), but it will add few bytes
to the structure even when there is no "FULL foreign join which requires EPQ"
involved in the query. That may not be so much of memory overhead since the
structure is used locally to add_paths_to_joinrel(), but it might be something
to think about. Instead, what if we call select_mergejoin_clauses() within
CreateLocalJoinPath() to get that information?

3. Talking about saving some CPU cycles - if a clauseless full join can be
implemented only using merge join, probably that's the only path available in
the list of paths for the given relation. Instead of building the same path
anew, should we use the existing path like GetExistingLocalJoinPath() for that
case? In fact, I am wondering whether we should look for an existing nestloop
path for all joins except full, in which case we should look for merge or hash
join path. We go on building a new path, if only there isn't an existing one.
That will certainly save some CPU cycles spent in costing the path.

4. Following comment mentions only hash join, but the code creates a merge join
or hash join path.
* If the jointype is JOIN_FULL, try to create a hashjoin join path from

5. Per comment below a clauseless full join can be implemented using only merge
join. Why are we checking extra->mergejoin_allowed? Shouldn't it be true here?
/*
* Special corner case: for "x FULL JOIN y ON true", there will be no
* join clauses at all. Note that mergejoin is our only join type
* that supports FULL JOIN without any join clauses, and in that case
* it doesn't require the input paths to be well ordered, so generate
* a clauseless mergejoin path from the cheapest-total-cost paths.
*/
if (extra->mergejoin_allowed && !extra->mergeclause_list)

Rethinking about the problem, the error comes because the inner or outer plan
of a merge join plan doesn't have pathkeys required by the merge join. This
happens because the merge join path had foreign path with pathkeys as inner or
outer path and corresponding fdw_outerpath didn't have those pathkeys. I am
wondering whether the easy and possibly correct solution here is to not replace
a ForeignPath with fdw_outerpath in GetExistingLocalJoinPath()? If we don't do
that, there won't be error building merge join plan and
postgresRecheckForeignScan() would correctly route the EPQ checks to the local
plan available as outer plan. Attached patch with that code removed.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
epq_plan_failure.patch text/x-patch 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-12-21 16:16:00 Re: Hang in pldebugger after git commit : 98a64d0
Previous Message Robert Haas 2016-12-21 15:56:41 Re: Cache Hash Index meta page.