Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Marko Tiikkaja <marko(at)joh(dot)to>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Fletcher <andy(at)prestigedigital(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Date: 2018-09-03 04:29:43
Message-ID: CAA4eK1Li+g_egv8Q7TFCxzZHSaiKe7Vrt23Nr=m1mysZ3-e_3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Sep 3, 2018 at 12:25 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>
> >> + * Treat window functions as parallel-restricted as the row ordering
> >> + * induced by them is non-deterministic. We can relax this condition for
> >> + * cases where the row ordering can be deterministic like when there is
> >> + * an ORDER BY on the primary key, but those cases don't seem to be
> >> + * interesting enough to have additional checks.
>
> This comment seems fairly confused. I'd say something like
>
> * Treat window functions as parallel-restricted because we aren't sure
> * whether the input row ordering is fully deterministic, and the output
> * of window functions might vary across workers if not. (In some cases,
> * like where the window frame orders by a primary key, we could relax
> * this restriction. But it doesn't currently seem worth expending extra
> * effort to do so.)
>

Changed.

> >> In addition to the above, I have marked all built-in window functions
> >> as parallel-restricted. I think even if we don't do that something
> >> like above check should be sufficient, but OTOH, I don't see any
> >> reason to keep the marking of such functions as parallel-safe. Is
> >> there a reason, why we shouldn't mark them as parallel-restricted?
>
> I am *strongly* against this. It's unnecessary catalog churn that we
> might need to undo someday, and it confuses a property of the window
> function infrastructure with a property of individual window functions.
> As a counterexample, if a window function were parallel-unsafe for
> some reason, we'd surely need to honor that. More realistically,
> someone might add a window function that actually needs to be
> parallel-restricted for reasons of its own, but then there would be
> no obvious distinction between such a function and one that you'd
> hacked up to be marked parallel-restricted even though it's safe in
> itself. If we then do make the sort of optimization suggested in the
> comment, it's likely that someone would just s/r/s/g for all the window
> functions and thereby break such a function.
>

Sounds sensible, I have removed that part from the patch.

You haven't mentioned anything about backpatching, but I don't see any
problem with backpatching this fix. So, I have prepared patches for
back branches wherever required. For 10, it is mostly a cosmetic
change (the patch didn't apply cleanly), but for 9.6 the prohibition
check is slightly different.

I will commit the attached patches in a day or so unless somebody sees
any problem.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
treat_window_func_calc_parallel_restricted_v3.pg10.patch application/octet-stream 3.3 KB
treat_window_func_calc_parallel_restricted_v3.patch application/octet-stream 3.2 KB
treat_window_func_calc_parallel_restricted_v3.pg96.patch application/octet-stream 3.0 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2018-09-03 13:49:17 Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Previous Message jimmy 2018-09-03 03:10:33 Re:Re: Re: Re: Re: Re: Bug: ERROR: invalid cache ID: 42 CONTEXT: parallel worker