Re: [bug?] Missed parallel safety checks, and wrong parallel safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-06-10 17:29:34
Message-ID: CA+TgmoZ=V5a=kuNs+ir3+S7j7x5XkVWVukFn9hgHMg_V3A_G4Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 10, 2021 at 12:54 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Fair enough. So, I think there is a consensus to drop this patch and
> if one wants then we can document these cases. Also, we don't want it
> to enable parallelism for Inserts where we are trying to pursue the
> approach to have a flag in pg_class which allows users to specify
> whether writes are allowed on a specified relation.

+1. The question that's still on my mind a little bit is whether
there's a reasonable alternative to forcing users to set a flag
manually. It seems less convenient than having to do the same thing
for a function, because most users probably only create functions
occasionally, but creating tables seems like it's likely to be a more
common operation. Plus, a function is basically a program, so it sort
of feels reasonable that you might need to give the system some hints
about what the program does, but that doesn't apply to a table.

Now, if we forget about partitioned tables here for a moment, I don't
really see why we couldn't do this computation based on the relcache
entry, and then just cache the flag there. I think anything that would
change the state for a plain old table would also cause some
invalidation that we could notice. And I don't think that the cost of
walking over triggers, constraints, etc. and computing the value we
need on demand would be exorbitant.
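
To make the plain-table idea concrete, here is a rough toy model, not
PostgreSQL code. RelEntry, add_trigger, invalidate and
max_parallel_hazard are all hypothetical names invented for this
sketch; the only thing borrowed from the real system is the 's', 'r',
'u' coding used by pg_proc.proparallel. It assumes, as suggested above,
that every change affecting the answer also invalidates the entry.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hazard levels ordered from least to most restrictive, using the same
# letters as proparallel: 's' (safe), 'r' (restricted), 'u' (unsafe).
RANK = {"s": 0, "r": 1, "u": 2}


@dataclass
class RelEntry:
    """Hypothetical stand-in for the relcache entry of a plain table."""
    name: str
    # Parallel-safety markings of whatever is reachable from the table's
    # triggers, constraints, defaults, and so on (simplified to a list).
    object_hazards: list = field(default_factory=list)
    cached_hazard: Optional[str] = None  # None means "not computed yet"
    computations: int = 0                # how many times we did the walk


def invalidate(rel):
    """Model a relcache invalidation: forget the cached flag."""
    rel.cached_hazard = None


def add_trigger(rel, hazard):
    """Model DDL on the table. ASSUMPTION: it always invalidates."""
    rel.object_hazards.append(hazard)
    invalidate(rel)


def max_parallel_hazard(rel):
    """Return the worst hazard for the table, computing it on demand."""
    if rel.cached_hazard is None:
        rel.computations += 1
        rel.cached_hazard = max(
            rel.object_hazards, key=RANK.__getitem__, default="s"
        )
    return rel.cached_hazard
```

In this model, repeated calls to max_parallel_hazard() on an unchanged
entry do the walk only once, and add_trigger() forces exactly one
recomputation on the next call.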

For a partitioned table, things are a lot more difficult. For one
thing, the cost of computation can be a lot higher; there might be a
thousand or more partitions. For another thing, computing the value
could have scary side effects, like opening all the partitions, which
would also mean taking locks on them and building expensive relcache
entries. For a third thing, we'd have no way of knowing whether the
value was still current, because an event that produces an
invalidation for a partition doesn't necessarily produce any
invalidation for the partitioned table.
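
The same kind of toy model shows where this breaks down for a
partitioned table. Again, this is only a sketch under stated
assumptions: Partition, PartitionedTable, add_partition_trigger and
parent_hazard are hypothetical names, and "opening" a partition is
reduced to bumping a counter rather than taking a lock or building a
relcache entry.

```python
from dataclasses import dataclass, field
from typing import Optional

# Same ordering as proparallel: safe < restricted < unsafe.
RANK = {"s": 0, "r": 1, "u": 2}


@dataclass
class Partition:
    """Hypothetical leaf partition."""
    hazards: list = field(default_factory=list)
    opened: int = 0  # times we had to open (and, in reality, lock) it


@dataclass
class PartitionedTable:
    """Hypothetical parent with a cached flag of its own."""
    partitions: list = field(default_factory=list)
    cached_hazard: Optional[str] = None  # None means "not computed yet"


def worst(hazards):
    """Most restrictive hazard in the list, or 's' if it is empty."""
    return max(hazards, key=RANK.__getitem__, default="s")


def add_partition_trigger(part, hazard):
    """Model DDL on one partition.

    ASSUMPTION: the resulting invalidation reaches only the partition,
    so the parent's cached flag is deliberately left untouched.
    """
    part.hazards.append(hazard)


def parent_hazard(parent):
    """Worst hazard across all partitions, cached on the parent."""
    if parent.cached_hazard is None:
        per_partition = []
        for part in parent.partitions:
            part.opened += 1  # every partition gets opened
            per_partition.append(worst(part.hazards))
        parent.cached_hazard = worst(per_partition)
    return parent.cached_hazard
```

With a thousand partitions, the first parent_hazard() call opens all
thousand of them. If add_partition_trigger() then marks one partition
unsafe, the parent keeps answering from its stale cache until something
clears cached_hazard by hand, at which point all the partitions get
opened again.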

So one idea is that maybe we only need an explicit flag for partitioned
tables, and for regular tables we can just work it out automatically.
Another idea is maybe we try to solve the problems somehow so that it
can also work with partitioned tables. I don't really have a great
idea right at the moment, but maybe it's worth devoting some more
thought to the problem.

--
Robert Haas
EDB: http://www.enterprisedb.com
