Re: PATCH: enabling parallel execution for cursors explicitly (experimental)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: enabling parallel execution for cursors explicitly (experimental)
Date: 2017-10-31 22:17:05
Message-ID: 8bdb6684-09d7-f799-0a6a-362cdc251b31@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 10/20/2017 03:23 PM, Robert Haas wrote:
>
> ...
>
> The main point I want to make clearly understood is that the current
> design relies on (1) functions being labeled correctly and (2) other
> dangerous code paths being unreachable because there's nothing that
> runs between EnterParallelMode and ExitParallelMode which could invoke
> them, except by calling a mislabeled function. Your patch expands the
> vulnerability surface from "executor code that can be reached without
> calling a mislabeled function" to "any code that can be reached by
> typing an SQL command". Just rejecting any queries that are
> parallel-unsafe probably closes a good chunk of the holes, but that
> still leaves a lot of code that's never been run in parallel mode
> before potentially now running in parallel mode - e.g. any DDL command
> you happen to type, transaction control commands, code that only runs
> when the server is idle like idle_in_transaction_timeout, cursor
> operations. A lot of that stuff is probably fine, but it's got to be
> thought through. Error handling might be a problem, too: what happens
> if a parallel worker is killed while the query is suspended? I
> suspect that doesn't work very nicely at all.
>

OK, understood, and thanks for explaining the possible issues. I do
appreciate that.

I still think it'd be valuable to support this, though, so I'm going to
spend more time on investigating what needs to be handled.

But maybe there's a simpler option - what if, while a PARALLEL cursor is
open, the only thing we allow is fetching from it? That is, this would work:

BEGIN;
...
DECLARE x PARALLEL CURSOR FOR SELECT * FROM t2 WHERE ...;
FETCH 1000 FROM x;
FETCH 1000 FROM x;
FETCH 1000 FROM x;
CLOSE x;
...
COMMIT;

but adding any other command between the DECLARE/CLOSE commands would fail.
That should close all the holes with parallel-unsafe stuff, right?
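
To make the rule concrete, here is a rough toy model of it in Python. This
is only a sketch of the intended behavior, not backend code: the class and
method names are made up for illustration, and it assumes at most one
PARALLEL cursor is open at a time. It also ignores what a rejected command
would do to the surrounding transaction.

```python
class ParallelCursorGuard:
    """Toy model of the proposed restriction (hypothetical, not PostgreSQL code)."""

    def __init__(self):
        # Name of the currently open PARALLEL cursor, or None if there is none.
        self.open_cursor = None

    def check(self, command, cursor=None):
        """Return True if the command would be allowed, False if rejected."""
        if self.open_cursor is None:
            # No PARALLEL cursor is open, so nothing is restricted.
            if command == "DECLARE PARALLEL CURSOR":
                self.open_cursor = cursor
            return True
        # A PARALLEL cursor is open: only FETCH/CLOSE on that cursor pass.
        if cursor == self.open_cursor and command in ("FETCH", "CLOSE"):
            if command == "CLOSE":
                self.open_cursor = None
            return True
        return False


guard = ParallelCursorGuard()
for cmd, cur in [("BEGIN", None), ("DECLARE PARALLEL CURSOR", "x"),
                 ("FETCH", "x"), ("INSERT", None), ("CLOSE", "x"),
                 ("COMMIT", None)]:
    print(cmd, cur, "allowed" if guard.check(cmd, cur) else "rejected")
```

In this model the INSERT issued between DECLARE and CLOSE is the only
command that gets rejected; everything before the DECLARE and after the
CLOSE passes through unchanged.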

Of course, this won't solve the issue with error handling / killing
suspended workers (which didn't occur to me before as a possible issue
at all, so thanks for pointing that out). But that's a significantly
more limited issue to fix than all the parallel-unsafe bits.

Now, I agree this is somewhat more limited than I hoped for, but OTOH it
still solves the issue I initially aimed for (processing large query
results in an efficient way).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
