assessing parallel-safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: assessing parallel-safety
Date: 2015-02-08 01:18:55
Message-ID: CA+TgmoarOjAY6v+WJEKObAQjGH5aU0ys-cytEdsW_E25csoVig@mail.gmail.com
Lists: pgsql-hackers

Amit's parallel sequential scan assumes that we can enter parallel
mode when the parallel sequential scan is initialized and exit
parallel mode when the scan ends and all the code that runs in between
will be happy with that. Unfortunately, that's not necessarily the
case. There are two ways it can fail:

1. Some other part of the query can contain functions that are not
safe to run in parallel-mode; e.g. a PL/pgsql function that writes
data or uses subtransactions.
2. The user can partially execute the query and then, while
execution is suspended, go do something not parallel-safe with the
results before resuming query execution.

To properly assess whether a query is parallel-safe, we need to
inspect the entire query for non-parallel-safe functions. We also
need the code that's going to execute the plan to tell us whether or
not it might want to do not-parallel-safe things between the time we
start running the query and the time we finish running it. So I tried
writing some code to address this; a first cut is attached. Here's
what it does:

1. As we parse each query, it sets a flag in the parse-state if we see
a non-immutable function. For the time being, I'm assuming immutable
== parallel-safe, although that's probably not correct in detail. It
also sets the flag if it sees a data-modifying operation, meaning an
insert, update, delete, or locking clause. The point of this is to
avoid making an extra pass over the query just to assess
parallel-safety; we want to accumulate that information as we go
along.

2. When parsing is complete, the parse-state flag is copied into the
Query, similar to what we already do for flags like hasModifyingCTE.

3. When the query is planned, planner() sets a flag in the
PlannerGlobal called parallelModeOK if the Query is not marked as
parallel-mode unsafe. There's also a new cursor option,
CURSOR_OPT_NO_PARALLEL, which forces parallelModeOK to false regardless
of what the Query says. planner() also initializes another flag,
parallelModeNeeded, to false. The idea here is that before
generating a parallel path, the planner should examine parallelModeOK
and skip it if that's false. If we end up creating a plan from a
parallel path, then the plan-generation function should set
parallelModeNeeded.

4. At the conclusion of planning, the parallelModeNeeded flag is
copied from the PlannerGlobal to the PlannedStmt.

5. ExecutorStart() calls EnterParallelMode() if parallelModeNeeded is
set and we're not already in parallel mode. ExecutorEnd() calls
ExitParallelMode() if EnterParallelMode() was called in
ExecutorStart().
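
To make the flow in steps 1-5 concrete, here is a rough, self-contained sketch of how the flags could travel from parser to executor. It is not code from the attached patch: the structs are toy stand-ins for the real ones, the field name parallelModeUnsafe and all toy_* function names are hypothetical, and the bit value given to CURSOR_OPT_NO_PARALLEL is invented. Only parallelModeOK, parallelModeNeeded and CURSOR_OPT_NO_PARALLEL are names taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the real structs; only the flags discussed are modeled. */
typedef struct ParseState    { bool parallelModeUnsafe; } ParseState;  /* hypothetical field */
typedef struct Query         { bool parallelModeUnsafe; } Query;       /* hypothetical field */
typedef struct PlannerGlobal { bool parallelModeOK; bool parallelModeNeeded; } PlannerGlobal;
typedef struct PlannedStmt   { bool parallelModeNeeded; } PlannedStmt;

#define CURSOR_OPT_NO_PARALLEL 0x0001   /* bit value invented for this sketch */

static bool in_parallel_mode = false;   /* stand-in for backend-wide state */

/* Step 1: accumulate "unsafe" while parsing, so no extra pass is needed. */
void toy_note_function(ParseState *ps, bool is_immutable)
{
    if (!is_immutable)                  /* assumes immutable == parallel-safe */
        ps->parallelModeUnsafe = true;
}

void toy_note_data_modification(ParseState *ps)
{
    ps->parallelModeUnsafe = true;      /* insert, update, delete, locking clause */
}

/* Step 2: copy the parse-state flag into the Query. */
void toy_finish_parse(const ParseState *ps, Query *query)
{
    query->parallelModeUnsafe = ps->parallelModeUnsafe;
}

/* Step 3: decide whether parallel paths may be considered at all. */
void toy_planner_begin(const Query *query, int cursorOptions, PlannerGlobal *glob)
{
    glob->parallelModeOK = !query->parallelModeUnsafe &&
        (cursorOptions & CURSOR_OPT_NO_PARALLEL) == 0;
    glob->parallelModeNeeded = false;
}

/*
 * Step 3, continued: path generation checks parallelModeOK, and plan
 * creation records that parallel mode is needed. The two are collapsed
 * into one call here for brevity.
 */
bool toy_use_parallel_path(PlannerGlobal *glob)
{
    if (!glob->parallelModeOK)
        return false;
    glob->parallelModeNeeded = true;
    return true;
}

/* Step 4: copy the result into the finished plan. */
void toy_planner_finish(const PlannerGlobal *glob, PlannedStmt *stmt)
{
    stmt->parallelModeNeeded = glob->parallelModeNeeded;
}

/* Step 5: enter parallel mode only if needed and not already in it. */
bool toy_executor_start(const PlannedStmt *stmt)
{
    if (stmt->parallelModeNeeded && !in_parallel_mode)
    {
        in_parallel_mode = true;
        return true;                    /* caller remembers that we entered */
    }
    return false;
}

void toy_executor_end(bool entered_here)
{
    if (entered_here)
    {
        assert(in_parallel_mode);
        in_parallel_mode = false;
    }
}
```

In this sketch a caller would run the toy_* functions in the order of the steps and pass the return value of toy_executor_start() to toy_executor_end(), which mirrors "ExitParallelMode() if EnterParallelMode() was called in ExecutorStart()".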

There are a few problems with this design that I don't immediately
know how to solve:

1. I'm concerned that the query-rewrite step could substitute a query
that is not parallel-safe for one that is. The upper Query might
still be flagged as safe, and that's all that planner() looks at.

2. Interleaving the execution of two parallel queries by firing up two
copies of the executor simultaneously can result in leaving parallel
mode at the wrong time.

3. Any code using SPI has to think hard about whether to pass
CURSOR_OPT_NO_PARALLEL. For example, PL/pgsql doesn't need to pass
this flag when caching a plan for a query that will be run to
completion each time it's executed. But it DOES need to pass the flag
for a FOR loop over an SQL statement, because the code inside the FOR
loop might do parallel-unsafe things while the query is suspended.
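
Problem 2 is perhaps easiest to see in a toy model. The sketch below is not taken from the patch; ToyExecutor and every function name in it are hypothetical, and parallel mode is reduced to a single boolean. It replays the interleaving "start A, start B, end A" under the rule from step 5 (enter only if not already in parallel mode, exit only if this executor entered) and reports whether B is left running a parallel plan with parallel mode switched off.

```c
#include <assert.h>
#include <stdbool.h>

static bool in_parallel_mode = false;   /* stand-in for backend-wide state */

/* Hypothetical, heavily simplified executor state. */
typedef struct ToyExecutor
{
    bool needs_parallel;          /* plan was built from a parallel path */
    bool entered_parallel_mode;   /* this executor switched parallel mode on */
    bool running;
} ToyExecutor;

void toy_executor_start(ToyExecutor *ex)
{
    ex->running = true;
    ex->entered_parallel_mode = false;
    if (ex->needs_parallel && !in_parallel_mode)
    {
        in_parallel_mode = true;
        ex->entered_parallel_mode = true;
    }
}

void toy_executor_end(ToyExecutor *ex)
{
    assert(ex->running);
    if (ex->entered_parallel_mode)
        in_parallel_mode = false;       /* exits even if others still need it */
    ex->running = false;
}

/* True if the executor still runs a parallel plan but parallel mode is off. */
bool toy_executor_is_stranded(const ToyExecutor *ex)
{
    return ex->running && ex->needs_parallel && !in_parallel_mode;
}

/* Replays: start A, start B, end A. Returns whether B ended up stranded. */
bool toy_replay_interleaving(void)
{
    ToyExecutor a = { .needs_parallel = true };
    ToyExecutor b = { .needs_parallel = true };
    bool stranded;

    toy_executor_start(&a);             /* A enters parallel mode */
    toy_executor_start(&b);             /* B sees it is already on */
    toy_executor_end(&a);               /* A leaves parallel mode */
    stranded = toy_executor_is_stranded(&b);
    toy_executor_end(&b);               /* B never entered, so no exit */
    return stranded;
}
```

Here toy_replay_interleaving() returns true: A turns parallel mode off while B is still mid-query, which is the "leaving parallel mode at the wrong time" case described in problem 2.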

Thoughts, either on the general approach or on what to do about the problems?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: assess-parallel-safety-v1.patch (application/x-patch, 31.1 KB)
