Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-22 06:58:19
Message-ID: CAJcOf-dScTXbOBiDv4H0sbaNB1e+sbu-yKCukT9dHYLduQTwug@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 22, 2021 at 2:30 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> I noticed that some comments may need to be updated since we introduced parallel insert in this patch.
>
> 1) src/backend/executor/execMain.c
> * Don't allow writes in parallel mode. Supporting UPDATE and DELETE
> * would require (a) storing the combocid hash in shared memory, rather
> * than synchronizing it just once at the start of parallelism, and (b) an
> * alternative to heap_update()'s reliance on xmax for mutual exclusion.
> * INSERT may have no such troubles, but we forbid it to simplify the
> * checks.
>
> As we will allow INSERT in parallel mode, we'd better change the comment here.
>

Thanks, it does need to be updated for parallel INSERT.
I was thinking of the following change:

- * Don't allow writes in parallel mode. Supporting UPDATE and DELETE
- * would require (a) storing the combocid hash in shared memory, rather
- * than synchronizing it just once at the start of parallelism, and (b) an
- * alternative to heap_update()'s reliance on xmax for mutual exclusion.
- * INSERT may have no such troubles, but we forbid it to simplify the
- * checks.
+ * Except for INSERT, don't allow writes in parallel mode. Supporting
+ * UPDATE and DELETE would require (a) storing the combocid hash in shared
+ * memory, rather than synchronizing it just once at the start of
+ * parallelism, and (b) an alternative to heap_update()'s reliance on xmax
+ * for mutual exclusion.
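
As a rough illustration of the rule the revised comment describes (this is not code from the patch), the check could be sketched as below. The enum and the helper name are hypothetical stand-ins, not PostgreSQL's real CmdType or any existing function:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a command-type enum; not PostgreSQL's CmdType. */
typedef enum SketchCmdType
{
    SKETCH_CMD_SELECT,
    SKETCH_CMD_INSERT,
    SKETCH_CMD_UPDATE,
    SKETCH_CMD_DELETE
} SketchCmdType;

/*
 * Hypothetical helper: returns true if a command of this type may run in
 * parallel mode under the rule "except for INSERT, don't allow writes".
 */
static bool
sketch_allowed_in_parallel_mode(SketchCmdType cmd)
{
    switch (cmd)
    {
        case SKETCH_CMD_SELECT:     /* read-only, always permitted */
        case SKETCH_CMD_INSERT:     /* the single write exception */
            return true;
        case SKETCH_CMD_UPDATE:     /* would need a shared combocid hash */
        case SKETCH_CMD_DELETE:     /* and an alternative to xmax locking */
            return false;
    }
    return false;                   /* unknown command types are rejected */
}
```

The only point of the sketch is that INSERT joins SELECT on the permitted side, while UPDATE and DELETE stay forbidden for the reasons (a) and (b) given in the comment.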

> 2) src/backend/storage/lmgr/README
> dangers are modest. The leader and worker share the same transaction,
> snapshot, and combo CID hash, and neither can perform any DDL or, indeed,
> write any data at all. Thus, for either to read a table locked exclusively by
>
> The same as 1), parallel insert is the exception.
>

I agree, it needs to be updated too, to account for parallel INSERT
now being supported.

-write any data at all. ...
+write any data at all (with the exception of parallel insert). ...

> 3) src/backend/storage/lmgr/README
> mutual exclusion method for such cases. Currently, the parallel mode is
> strictly read-only, but now we have the infrastructure to allow parallel
> inserts and parallel copy.
>
> Maybe we can say:
> +mutual exclusion method for such cases. Currently, we only allowed parallel
> +inserts, but we already have the infrastructure to allow parallel copy.
>

Yes, I agree. Something like:

-mutual exclusion method for such cases. Currently, the parallel mode is
-strictly read-only, but now we have the infrastructure to allow parallel
-inserts and parallel copy.
+mutual exclusion method for such cases. Currently, only parallel insert is
+allowed, but we have the infrastructure to allow parallel copy.

Let me know if these changes seem OK to you.

Regards,
Greg Nancarrow
Fujitsu Australia
