Re: assessing parallel-safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: assessing parallel-safety
Date: 2015-02-11 20:21:12
Message-ID: CA+TgmoZuW0eYEVqFyTgjE15sTdL2vhx9e1ECK0_RkLYY-kLDcQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 11, 2015 at 9:39 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Feb 8, 2015 at 12:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sun, Feb 8, 2015 at 11:31 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>> On Sat, Feb 07, 2015 at 08:18:55PM -0500, Robert Haas wrote:
>>>> There are a few problems with this design that I don't immediately
>>>> know how to solve:
>>>>
>>>> 1. I'm concerned that the query-rewrite step could substitute a query
>>>> that is not parallel-safe for one that is. The upper Query might
>>>> still be flagged as safe, and that's all that planner() looks at.
>>>
>>> I would look at determining the query's parallel safety early in the planner
>>> instead; simplify_function() might be a cheap place to check. Besides
>>> avoiding rewriter trouble, this allows one to alter parallel safety of a
>>> function without invalidating Query nodes serialized in the system catalogs.
>>
>> Thanks, I'll investigate that approach.
>
> This does not seem to work out nicely. The problem here is that
> simplify_function() gets called from eval_const_expressions() which
> gets called from a variety of places, but the principal one seems to
> be subquery_planner(). So if you have a query with two subqueries,
> and the second one contains something parallel-unsafe, you might by
> that time have already generated a parallel plan for the first one,
> which won't do. Unless we want to rejigger this so that we do a
> complete eval_const_expressions() pass over the entire query tree
> (including all subqueries) FIRST, and then only after that go back and
> plan all of those subqueries, I don't see how to make this work; and
> I'm guessing that there are good reasons not to do that.
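
The ordering problem quoted above can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: the Query class, the function names (safe_fn, unsafe_fn) and both planner functions are hypothetical stand-ins. It shows that a check interleaved with per-subquery planning notices the unsafe function too late, while a separate full pass over the tree does not.

```python
from dataclasses import dataclass, field

# Hypothetical set of parallel-unsafe functions, for illustration only.
UNSAFE_FUNCTIONS = {"unsafe_fn"}


@dataclass
class Query:
    name: str
    functions: list
    subqueries: list = field(default_factory=list)


def plan_interleaved(top):
    """Check safety while planning each (sub)query, in order."""
    state = {"unsafe_seen": False}
    plans = []

    def walk(q):
        for sub in q.subqueries:
            walk(sub)
        if any(f in UNSAFE_FUNCTIONS for f in q.functions):
            state["unsafe_seen"] = True
        # The plan choice uses only what has been seen so far.
        plans.append((q.name, "serial" if state["unsafe_seen"] else "parallel"))

    walk(top)
    return plans


def contains_unsafe(q):
    """Extra pass: inspect the whole tree, including all subqueries."""
    return any(f in UNSAFE_FUNCTIONS for f in q.functions) or any(
        contains_unsafe(sub) for sub in q.subqueries
    )


def plan_two_pass(top):
    """Decide safety for the entire tree first, then plan everything."""
    mode = "serial" if contains_unsafe(top) else "parallel"
    plans = []

    def walk(q):
        for sub in q.subqueries:
            walk(sub)
        plans.append((q.name, mode))

    walk(top)
    return plans


top = Query("top", [], [Query("sub1", ["safe_fn"]), Query("sub2", ["unsafe_fn"])])
interleaved = plan_interleaved(top)
two_pass = plan_two_pass(top)
```

In the interleaved version, sub1 has already received a parallel plan by the time sub2 reveals the unsafe function; the two-pass version avoids that at the cost of walking the whole tree up front.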

And ... while I was just talking with Jan about this, I realized
something that should have been blindingly obvious to me from the
beginning, which is that anything we do at parse time is doomed to
failure, because the properties of a function that we use to determine
parallel-safety can be modified later:

rhaas=# alter function random() immutable; -- no it isn't
ALTER FUNCTION

I think we may want a dedicated parallel-safe property for functions
rather than piggybacking on provolatile, but that will probably also
be changeable via ALTER FUNCTION, and stored rules won't get
miraculously updated. So this definitely can't be something we figure
out at parse-time ... it's got to be determined later. But at the
moment I see no way to do that without an extra pass over the whole
rewritten query tree. :-(
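
A minimal sketch of why a flag computed at parse time goes stale, again as a toy Python model rather than anything in the PostgreSQL source. The catalog dict stands in for the volatility marking stored in pg_proc.provolatile; my_func, StoredRule, and the rule "non-volatile means safe" are assumptions made only for this demonstration.

```python
# Toy catalog: function name -> volatility marking.
catalog = {"my_func": "immutable"}


def is_parallel_safe_now(func_names):
    """Assumed rule for this demo: a query is safe if no function is volatile."""
    return all(catalog[name] != "volatile" for name in func_names)


class StoredRule:
    """Models a query serialized at parse time along with a cached flag."""

    def __init__(self, func_names):
        self.func_names = list(func_names)
        # Computed once, at parse time, and never refreshed.
        self.cached_safe = is_parallel_safe_now(self.func_names)


rule = StoredRule(["my_func"])      # flagged safe when the rule is stored

# Models a later "ALTER FUNCTION my_func() VOLATILE".
catalog["my_func"] = "volatile"

stale = rule.cached_safe                        # parse-time answer
fresh = is_parallel_safe_now(rule.func_names)   # answer from a later pass
```

The stored flag still says the rule is safe, while a check made against the current catalog says it is not, which is why the determination has to happen after rewriting rather than at parse time.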

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
