Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Amit Kapila <akapila(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode
Date: 2021-03-24 04:30:49
Message-ID: 1030301.1616560249@sss.pgh.pa.us
Lists: pgsql-committers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Mar 23, 2021 at 3:13 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> You cache it.

> Yeah, exactly. I don't think it's super-easy to understand exactly how
> to make that work well for something like this. It would be easy
> enough to set a flag in the relcache whose value is computed the first
> time we need it and is then consulted every time after that, and you
> just invalidate it based on sinval messages. But, if you go with that
> design, you've got a big problem: now an insert has to lock all the
> tables in the partitioning hierarchy to decide whether it can run in
> parallel or not, and we do not want that.

Possibly-crazy late-night idea ahead:

IIUC, we need to know a global property of a partitioning hierarchy:
is every trigger, CHECK constraint, etc. that might be run by an INSERT
parallel-safe? What we see here is that reverse-engineering that
property every time we need to know it is just too expensive, even
with use of our available caching methods.

How about a declarative approach instead? That is, if a user would
like parallelized inserts into a partitioned table, she must declare
the table parallel-safe with some suitable annotation. Then, checking
the property during DML is next door to free, and instead we have to think
about whether and how to enforce that the marking is valid during DDL.
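
To make the cost difference concrete, here is a toy sketch in Python. It is not PostgreSQL code: the Table and Function classes, the declared_parallel_safe field, and both helper functions are hypothetical names invented for illustration. It only contrasts deriving the property by walking the whole hierarchy with reading a single declared marking on the root.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Function:
    # Hypothetical stand-in for a function with a parallel-safety marking.
    name: str
    parallel_safe: bool


@dataclass
class Table:
    # Hypothetical stand-in for a (possibly partitioned) table.
    name: str
    # Functions reachable from triggers, CHECK constraints, etc.
    functions: List[Function] = field(default_factory=list)
    partitions: List["Table"] = field(default_factory=list)
    # Assumed user-declared annotation; not an actual PostgreSQL option.
    declared_parallel_safe: bool = False


def derived_parallel_safe(table: Table) -> bool:
    """Reverse-engineer the property: visits every table in the hierarchy."""
    if not all(f.parallel_safe for f in table.functions):
        return False
    return all(derived_parallel_safe(p) for p in table.partitions)


def declared_is_parallel_safe(table: Table) -> bool:
    """Declarative check: reads one flag on the root, visits no partition."""
    return table.declared_parallel_safe


safe_fn = Function("range_check_fn", parallel_safe=True)
unsafe_fn = Function("audit_trigger_fn", parallel_safe=False)

root = Table(
    "measurements",
    functions=[safe_fn],
    partitions=[
        Table("measurements_2020", functions=[safe_fn]),
        Table("measurements_2021", functions=[safe_fn, unsafe_fn]),
    ],
    declared_parallel_safe=True,
)

derived = derived_parallel_safe(root)       # walks all three tables
declared = declared_is_parallel_safe(root)  # constant-time lookup
```

In this example the two answers disagree: the declaration says safe while one partition uses an unsafe function. That gap is exactly what DDL-time validation would have to address.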

I don't honestly see a real cheap way to enforce such a property.
For instance, if someone does ALTER FUNCTION to remove a function's
parallel-safe marking, we can't really run around and verify that the
function is not used in any CHECK constraint. (Aside from the cost,
there would be race conditions.)

But maybe we don't have to enforce it exactly. It could be on the
user's head that the marking is accurate. We could prevent any
really bad misbehavior by having parallel workers error out if they
see they've been asked to execute a non-parallel-safe function.
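
A minimal sketch of such a backstop, again as a Python toy model rather than PostgreSQL code. The names Function, ParallelUnsafeError, and execute are hypothetical, and the in_parallel_worker argument stands in for whatever the real executor would consult to know it is running in a worker.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class ParallelUnsafeError(RuntimeError):
    """Raised when a worker is asked to run a function not marked parallel-safe."""


@dataclass
class Function:
    # Hypothetical stand-in for a function with a parallel-safety marking.
    name: str
    parallel_safe: bool
    impl: Callable[..., object]


def execute(func: Function, args: Tuple, in_parallel_worker: bool) -> object:
    """Run func, refusing unsafe functions when inside a parallel worker."""
    if in_parallel_worker and not func.parallel_safe:
        # The table-level marking was wrong or stale: fail loudly
        # instead of misbehaving silently.
        raise ParallelUnsafeError(
            f"function {func.name} is not parallel-safe"
        )
    return func.impl(*args)


is_positive = Function("is_positive", True, lambda x: x > 0)
next_id = Function("next_id", False, lambda: 1)

# A safe function runs normally inside a worker.
ok_in_worker = execute(is_positive, (3,), in_parallel_worker=True)

# An unsafe function still runs in the leader, but errors in a worker.
ok_in_leader = execute(next_id, (), in_parallel_worker=False)
try:
    execute(next_id, (), in_parallel_worker=True)
    worker_rejected = False
except ParallelUnsafeError:
    worker_rejected = True
```

The check costs one flag test per call, so an inaccurate marking produces an error at execution time rather than wrong behavior.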

Or there are probably other ways to slice it up. But I think some
outside-the-box thinking might be helpful here.

regards, tom lane
