Re: Parallel copy

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Ants Aasma <ants(at)cybertec(dot)at>, vignesh C <vignesh21(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alastair Turner <minion(at)decodable(dot)me>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-05-14 06:17:59
Message-ID: CAA4eK1LW3tWC0C-UsHU=53hgXKn0pynNogaCGWeD5+PyyngOpw@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 14, 2020 at 12:39 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, May 12, 2020 at 1:01 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > I don't understand why we need to do something special for combo CIDs
> > if they are not generated during this operation?
>
> Hmm. Well I guess if they're not being generated then we don't need to
> do anything about them, but I still think we should try to work around
> having to disable parallelism for a table which is referenced by
> foreign keys.
>

Okay, just to be clear, we want to allow parallelism for a table that
has foreign keys. Basically, a parallel copy should work while
loading data into tables having FK references.

To support that, we need to consider a few things.
a. Currently, we increment the command counter each time we take a key
share lock on a tuple during trigger execution. I am not sure whether
this is required during COPY execution or whether we can just
increment it once for the whole copy. If incrementing the command
counter once per COPY command is enough, then for the parallel copy we
can ensure that we do it just once, at the end of the parallel copy;
if not, we might need some special handling.
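
To make the two options in (a) concrete, here is a toy model of them. It
is only a sketch, not PostgreSQL code: CommandCounter, copy_bump_per_lock
and copy_bump_once are hypothetical names made up for illustration, and
the key share lock itself is not modelled at all.

```python
class CommandCounter:
    """Toy stand-in for the backend's command counter (hypothetical)."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> None:
        self.value += 1


def copy_bump_per_lock(n_locks: int) -> int:
    """Current behaviour as described above: one increment per key share lock."""
    counter = CommandCounter()
    for _ in range(n_locks):
        # A key share lock on the referenced tuple would be taken here.
        counter.increment()
    return counter.value


def copy_bump_once(n_locks: int) -> int:
    """Alternative: take all the locks, increment once at the end of the copy."""
    counter = CommandCounter()
    for _ in range(n_locks):
        pass  # lock taken, counter left alone
    counter.increment()
    return counter.value
```

If the second policy turns out to be sufficient, the single increment is
the only step that would have to be coordinated across the workers.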

b. Another point is that, after inserting rows, we record the CTIDs of
the tuples in the event queue, and once all tuples are processed we
call the FK trigger for each CTID. With parallelism, the FK checks
will be processed once a worker has processed one chunk. I don't see
any problem with that, but it will be a bit different from what
we do in the serial case. Do you see any problem with this?
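
The difference in (b) can be sketched with a small single-process
simulation. This is an assumption-laden toy, not the actual trigger
machinery: run_serial, run_chunked and the (block, offset) tuples standing
in for CTIDs are all hypothetical, and the "workers" simply run one after
another.

```python
from typing import List, Tuple

CTID = Tuple[int, int]       # (block, offset), a stand-in for a tuple id
Event = Tuple[str, CTID]     # ("insert" | "check", ctid)


def run_serial(rows: List[CTID]) -> List[Event]:
    """Insert every row, queue its CTID, then fire all FK checks at the end."""
    log: List[Event] = []
    queue: List[CTID] = []
    for ctid in rows:
        log.append(("insert", ctid))
        queue.append(ctid)
    for ctid in queue:
        log.append(("check", ctid))
    return log


def run_chunked(rows: List[CTID], chunk_size: int) -> List[Event]:
    """Each chunk keeps its own queue and fires its FK checks when it is done."""
    log: List[Event] = []
    for start in range(0, len(rows), chunk_size):
        queue: List[CTID] = []
        for ctid in rows[start:start + chunk_size]:
            log.append(("insert", ctid))
            queue.append(ctid)
        for ctid in queue:
            log.append(("check", ctid))
    return log


def checked(log: List[Event]) -> List[CTID]:
    """CTIDs that had an FK check fired, in firing order."""
    return [ctid for kind, ctid in log if kind == "check"]


def first_check_position(log: List[Event]) -> int:
    """Index in the log of the first FK check."""
    return next(i for i, (kind, _) in enumerate(log) if kind == "check")
```

In this model both variants check exactly the same CTIDs; the only change
is that in the chunked variant the first check fires before the later
chunks have been inserted.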

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
