Re: assessing parallel-safety

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: assessing parallel-safety
Date: 2015-02-14 05:09:59
Message-ID: 20150214050959.GB3906203@tornado.leadboat.com
Lists: pgsql-hackers

On Fri, Feb 13, 2015 at 05:13:06PM -0500, Robert Haas wrote:
> On Fri, Feb 13, 2015 at 12:10 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > Given your wish to optimize, I recommend first investigating the earlier
> > thought to issue eval_const_expressions() once per planner() instead of once
> > per subquery_planner(). Compared to the parallelModeRequired/parallelModeOK
> > idea, it would leave us with a more maintainable src/backend/optimizer. I
> > won't object to either design, though.
>
> In off-list discussions with Tom Lane, he pressed hard on the question
> of whether we can zero out the number of functions that are
> parallel-unsafe (i.e. can't be run while parallel even in the master)
> vs. parallel-restricted (must be run in the master rather than
> elsewhere). The latter category can be handled by strictly local
> decision-making, without needing to walk the entire plan tree; e.g.
> parallel seq scan can look like this:
>
>   Parallel Seq Scan on foo
>     Filter: a = pg_backend_pid()
>     Parallel Filter: b = 1
>
> And, indeed, I was pleasantly surprised when surveying the catalogs by
> how few functions were truly unsafe, vs. merely needing to be
> restricted to the master. But I can't convince myself that there's
> any sane way of allowing database writes even in the master; creating
> new combo CIDs there seems disastrous, and users will be sad if a
> parallel plan is chosen for some_plpgsql_function_that_does_updates()
> and this then errors out because of parallel mode.

Yep. The scarcity of parallel-unsafe, built-in functions reflects the
dominant subject matter of built-in functions. User-defined functions are
more diverse. It would take quite a big hammer to beat the parallel-unsafe
category into irrelevancy.
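
To make the three categories concrete, here is a minimal sketch in Python of the "strictly local decision-making" described in the quoted text. It is a toy model, not planner code: the Safety enum, the FUNC_SAFETY table, its labels, and split_quals() are all hypothetical names made up for illustration. Each qual is assigned the most restrictive label among the functions it calls; safe quals go to the workers, restricted quals stay in the master, and a single unsafe qual rules out parallel mode altogether.

```python
from enum import Enum


class Safety(Enum):
    SAFE = "safe"              # may run in any worker
    RESTRICTED = "restricted"  # must run in the master
    UNSAFE = "unsafe"          # cannot run in parallel mode at all


# Hypothetical labels for illustration only; not read from any real catalog.
FUNC_SAFETY = {
    "int4eq": Safety.SAFE,
    "pg_backend_pid": Safety.RESTRICTED,
    "some_plpgsql_function_that_does_updates": Safety.UNSAFE,
}

_ORDER = [Safety.SAFE, Safety.RESTRICTED, Safety.UNSAFE]


def qual_safety(funcs):
    """Return the most restrictive label among the functions a qual calls."""
    return max((FUNC_SAFETY[f] for f in funcs),
               key=_ORDER.index, default=Safety.SAFE)


def split_quals(quals):
    """Split quals into (master_filter, parallel_filter).

    Each qual is a (text, [function names]) pair.  Returns None when any
    qual is unsafe, meaning no parallel plan should be considered.
    """
    master, parallel = [], []
    for text, funcs in quals:
        safety = qual_safety(funcs)
        if safety is Safety.UNSAFE:
            return None
        (parallel if safety is Safety.SAFE else master).append(text)
    return master, parallel
```

With the two quals from the example plan, split_quals() puts "a = pg_backend_pid()" in the master-side list and "b = 1" in the worker-side list. The unsafe case is the one that cannot be decided locally: one such call anywhere forces the whole query out of parallel mode.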

> Tom also argued that (1) trying to assess parallel-safety before
> preprocess_expressions() was doomed to fail, because
> preprocess_expressions() can add function calls via, at least,
> inlining and default argument insertion and (2)
> preprocess_expressions() can't be moved earlier without changing
> the semantics. I'm not sure if he's right, but those are sobering
> conclusions. Andres pointed out to me via IM that inlining is
> dismissable here; if inlining introduces a parallel-unsafe construct,
> the inlined function was mislabeled to begin with, and the user has
> earned the error message they get. Default argument insertion might
> not be dismissable although the practical risks seem low.

All implementation difficulties being equal, I would opt to check for parallel
safety after inserting default arguments and before inlining. Checking before
inlining reveals the mislabeling every time instead of revealing it only when
inline_function() gives up. Timing of the parallel safety check relative to
default argument insertion matters less. Remember, the risk is merely that a
user will find cause to remove a parallel-safe marking where he/she expected
the system to deduce parallel unsafety. If implementation difficulties lead
to some other timings, that won't bother me.
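
The ordering argument can be sketched with a toy model in Python. Everything here is assumed for illustration: the Func record, the CATALOG entries, and planner_verdict() are hypothetical and do not correspond to actual planner functions. An expression is reduced to the list of functions it calls; default arguments are always inserted first, and the safety check runs either before or after inlining.

```python
from dataclasses import dataclass, field


@dataclass
class Func:
    label_safe: bool                              # the user's parallel-safety marking
    body: list = field(default_factory=list)      # functions called by the body
    defaults: list = field(default_factory=list)  # functions called by default arguments
    inlinable: bool = False                       # whether inlining succeeds


# Hypothetical catalog; names and labels are invented for this sketch.
CATALOG = {
    "writes_table": Func(label_safe=False),
    "mislabeled_sql": Func(label_safe=True, body=["writes_table"], inlinable=True),
    "mislabeled_opaque": Func(label_safe=True, body=["writes_table"]),
    "with_default": Func(label_safe=True, defaults=["writes_table"]),
}


def insert_defaults(calls):
    """Expose the function calls hidden in default-argument expressions."""
    out = []
    for name in calls:
        out.append(name)
        out.extend(insert_defaults(CATALOG[name].defaults))
    return out


def inline(calls):
    """Replace each inlinable function by the calls its body makes."""
    out = []
    for name in calls:
        func = CATALOG[name]
        out.extend(inline(func.body) if func.inlinable else [name])
    return out


def planner_verdict(calls, check="before_inlining"):
    """Return True if the labels visible at check time all say 'safe'."""
    calls = insert_defaults(calls)
    if check == "after_inlining":
        calls = inline(calls)
    return all(CATALOG[name].label_safe for name in calls)
```

In this model, checking before inlining trusts the label of both mislabeled functions, so the user gets the error in both cases. Checking after inlining quietly deduces unsafety for mislabeled_sql and trusts the label only for mislabeled_opaque, so the mislabeling surfaces only when inlining does not happen. The with_default case is judged unsafe under either timing because default insertion runs first.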

Thanks,
nm
