Re: postgres_fdw vs. force_parallel_mode on ppc

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw vs. force_parallel_mode on ppc
Date: 2016-02-23 12:23:16
Message-ID: CA+TgmoZQh4EJ_SbAUWhSPS8uy=Sm+VaUiAGF+foJDjbuFsVLmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 23, 2016 at 11:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Even if there were, it would not fix this bug, because AFAICS the only
> thing that set_rel_consider_parallel is chartered to do is set the
> per-relation consider_parallel flag. The failure that is happening in
> that regression test with force_parallel_mode turned on happens because
> standard_planner plasters a Gather node at the top of the plan, causing
> the whole plan including the FDW access to happen inside a parallel
> worker. The only way to prevent that is to clear the
> wholePlanParallelSafe flag, which as far as I can tell (not that any of
> this is documented worth a damn) isn't something that
> set_rel_consider_parallel is supposed to do.

Hmm. Well, if you tested it, or looked at the places where
wholePlanParallelSafe is cleared, you would find that it DOES fix the
bug. create_plan() clears wholePlanParallelSafe if the plan is not
parallel-safe, and the plan won't be parallel-safe unless
consider_parallel was set for the underlying relation. In case you'd
like to test it for yourself, here's the PoC patch I wrote:

diff --git a/src/backend/optimizer/path/allpaths.c
b/src/backend/optimizer/path/allpaths.c
index bcb668f..8a4179e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -527,6 +527,11 @@ set_rel_consider_parallel(PlannerInfo *root,
RelOptInfo *rel,
return;
return;
}
+
+ /* Not for foreign tables. */
+ if (rte->relkind == RELKIND_FOREIGN_TABLE)
+ return;
+
break;

case RTE_SUBQUERY:

Adding that makes the postgres_fdw case pass.

> It looks to me like there is a good deal of fuzzy thinking here about the
> difference between locally parallelizable and globally parallelizable
> plans, ie Gather at the top vs Gather somewhere else.

If you have a specific complaint, I'm happy to try to improve things,
or you can. I think however that it is also possible that you haven't
fully understood the code I've spent the last year or so developing
yet, possibly because I haven't documented it well enough, but
possibly also because you haven't spent much time looking on it yet.
I'm glad you are, by the way, because I'm sure there are a bunch of
things here that you can improve over what I was able to do,
especially on the planner side of things, and that would be great.
However, a bit of forbearance would be appreciated.

> I also note with
> dismay that turning force_parallel_mode on seems to pretty much disable
> any testing of local parallelism.

No, I don't think so. It doesn't push a Gather node on top of a plan
that already contains a Gather, because such a plan isn't
parallel_safe. Nor does it suppress generation of parallel paths
otherwise.

>> In view of 69d34408e5e7adcef8ef2f4e9c4f2919637e9a06, we shouldn't
>> blindly assume that foreign scans are not parallel-safe, but we can't
>> blindly assume the opposite either. Maybe we should assume that the
>> foreign scan is parallel-safe only if one or more of the new methods
>> introduced by the aforementioned commit are set, but actually that
>> doesn't seem quite right. That would tell us whether the scan itself
>> can be parallelized, not whether it's safe to run serially but within
>> a parallel worker. I think maybe we need a new FDW API that gets
>> called from set_rel_consider_parallel() with the root, rel, and rte as
>> arguments and which can return a Boolean. If the callback is not set,
>> assume false.
>
> Meh. As things stand, postgres_fdw would have to aver that it can't ever
> be safely parallelized, which doesn't seem like a very satisfactory answer
> even if there are other FDWs that work differently (and which would those
> be? None that use a socket-style connection to an external server.)

file_fdw is parallel-safe, and KaiGai posted a patch that makes it
parallel-aware, though that would have needed more work than I'm
willing to put in right now to make it committable. So in other
words...

> The commit you mention above seems to me to highlight the dangers of
> accepting hook patches with no working use-case to back them up.
> AFAICT it's basically useless for typical FDWs because of this
> multiple-connection problem.

...I didn't ignore this principal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2016-02-23 12:48:17 Re: Pushing down sorted joins
Previous Message Victor Wagner 2016-02-23 12:04:01 Re: Convert pltcl from strings to objects