Re: Enabling parallelism for queries coming from SQL or other PL functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling parallelism for queries coming from SQL or other PL functions
Date: 2017-03-07 20:24:55
Message-ID: CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 7, 2017 at 9:07 AM, Rafia Sabih
<rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> I have split the patch into two, one is to allow optimiser to select a
> parallel plan for queries in PL functions
> (pl_parallel_opt_support_v1.patch), wherein CURSOR_OPT_PARALLEL_OK is passed
> at required places.
>
> Next, the patch for allowing execution of such queries in parallel mode,
> that involves infrastructural changes along the lines mentioned upthread
> (pl_parallel_exec_support_v1.patch).

Logically, these patches go in the other order: you have to make the
infrastructure changes first, and then after that you can enable
parallelism in the places that are now safe.

I think any patch that bases the determination of whether it's OK to
use parallel query on the DestReceiver type is unacceptable. There's
just no logical reason to believe that every place that uses a certain
set of DestReceiver types is OK and others are not. What matters is
whether the query can ever be executed more than once, and that's got
to be tracked explicitly.

Here's a draft patch showing the sort of thing I have in mind. I
think it needs more work, but it gives you the idea, I hope. This is
loosely based on your pl_parallel_exec_support_v1.patch, but what I've
done here is added some flags that carry the information about whether
there will be only one or maybe more than one call to ExecutorRun to a
bunch of relevant places.

I think this might have the effect of disabling parallel query in some
cases where PL/pgsql currently allows it, and I think that may be
necessary. (We may need to back-patch a different fix into 9.6.)
There are two places where we currently set CURSOR_OPT_PARALLEL_OK in
PL/pgsql: exec_stmt_return_query() sets it when calling
exec_dynquery_with_params(), and exec_run_select() calls it when
calling exec_prepare_plan() if parallelOK is set. The latter is OK,
because exec_run_select() runs the plan via
SPI_execute_plan_with_paramlist(), which calls _SPI_execute_plan(),
which calls _SPI_pquery(). But the former is broken, because
exec_stmt_return_query() executes the query by calling
SPI_cursor_fetch() with a fetch count of 50, and that calls
_SPI_cursor_operation() which calls PortalRunFetch() -- and of course
each call to PortalRunFetch() is going to cause a separate call to
PortalRunSelect(), resulting in a separate call to ExecutorRun().
Oops.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
execute-once.patch application/octet-stream 16.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-03-07 20:27:36 Re: ANALYZE command progress checker
Previous Message Andres Freund 2017-03-07 19:38:57 Re: SQL/JSON in PostgreSQL