Re: postgres_fdw vs. force_parallel_mode on ppc

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw vs. force_parallel_mode on ppc
Date: 2016-02-23 06:17:29
Message-ID: 7651.1456208249@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Feb 23, 2016 at 2:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Foreign tables are supposed to be categorically excluded from
>>> parallelism. Not sure why that's not working in this instance.

>> BTW, I wonder where you think that's supposed to be enforced, because
>> I sure can't find any such logic.

> RTEs are checked in set_rel_consider_parallel(), and I thought there
> was a check there related to foreign tables, but there isn't. Oops.

Even if there were, it would not fix this bug, because AFAICS the only
thing that set_rel_consider_parallel is chartered to do is set the
per-relation consider_parallel flag. The failure that is happening in
that regression test with force_parallel_mode turned on happens because
standard_planner plasters a Gather node at the top of the plan, causing
the whole plan including the FDW access to happen inside a parallel
worker. The only way to prevent that is to clear the
wholePlanParallelSafe flag, which as far as I can tell (not that any of
this is documented worth a damn) isn't something that
set_rel_consider_parallel is supposed to do.

It looks to me like there is a good deal of fuzzy thinking here about the
difference between locally parallelizable and globally parallelizable
plans, ie Gather at the top vs Gather somewhere else. I also note with
dismay that turning force_parallel_mode on seems to pretty much disable
any testing of local parallelism.

> In view of 69d34408e5e7adcef8ef2f4e9c4f2919637e9a06, we shouldn't
> blindly assume that foreign scans are not parallel-safe, but we can't
> blindly assume the opposite either. Maybe we should assume that the
> foreign scan is parallel-safe only if one or more of the new methods
> introduced by the aforementioned commit are set, but actually that
> doesn't seem quite right. That would tell us whether the scan itself
> can be parallelized, not whether it's safe to run serially but within
> a parallel worker. I think maybe we need a new FDW API that gets
> called from set_rel_consider_parallel() with the root, rel, and rte as
> arguments and which can return a Boolean. If the callback is not set,
> assume false.

Meh. As things stand, postgres_fdw would have to aver that it can't ever
be safely parallelized, which doesn't seem like a very satisfactory answer
even if there are other FDWs that work differently (and which would those
be? None that use a socket-style connection to an external server.)

The commit you mention above seems to me to highlight the dangers of
accepting hook patches with no working use-case to back them up.
AFAICT it's basically useless for typical FDWs because of this
multiple-connection problem.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2016-02-23 06:18:30 Re: Optimization for updating foreign tables in Postgres FDW
Previous Message Robert Haas 2016-02-23 05:45:46 Re: postgres_fdw vs. force_parallel_mode on ppc