Re: Fast COPY FROM based on batch insert

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, tanghy(dot)fnst(at)fujitsu(dot)com, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, houzj(dot)fnst(at)fujitsu(dot)com
Subject: Re: Fast COPY FROM based on batch insert
Date: 2022-08-10 08:30:20
Message-ID: CAPmGK17pyQgsz9zy9T4avRNw8TY74T5-v1n0vSUgSn8tmjGgVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Aug 10, 2022 at 1:06 AM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> On Tue, Aug 9, 2022 at 4:45 AM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
>> * When running AFTER ROW triggers in CopyMultiInsertBufferFlush(), the
>> patch uses the slots passed to ExecForeignBatchInsert(), not the ones
>> returned by the callback function, but I don't think that that is
>> always correct, as the documentation about the callback function says:
>>
>> The return value is an array of slots containing the data that was
>> actually inserted (this might differ from the data supplied, for
>> example as a result of trigger actions.)
>> The passed-in <literal>slots</literal> can be re-used for this purpose.
>>
>> postgres_fdw re-uses the passed-in slots, but other FDWs might not, so
>> I modified the patch to reference the returned slots when running the
>> AFTER ROW triggers.

I noticed that my explanation was not correct. Let me explain.
Before commit 82593b9a3, when batching into a view referencing a
postgres_fdw foreign table that has WCO constraints, postgres_fdw used
the passed-in slots to store the first tuple that was actually
inserted to the remote table. But that commit disabled batching in
that case, so postgres_fdw wouldn’t use the passed-in slots (until we
support batching when there are WCO constraints from the parent views
and/or AFTER ROW triggers on the foreign table).

> + /* If any rows were inserted, run AFTER ROW INSERT triggers. */
> ...
> + for (i = 0; i < inserted; i++)
> + {
> + TupleTableSlot *slot = rslots[i];
> ...
> + slot->tts_tableOid =
> + RelationGetRelid(resultRelInfo->ri_RelationDesc);
>
> It seems the return value of `RelationGetRelid(resultRelInfo->ri_RelationDesc)` can be stored in a variable outside the for loop.
> Inside the for loop, assign this variable to slot->tts_tableOid.

Actually, I did this to match the code in ExecBatchInsert(), but that
seems like a good idea, so I’ll update the patch as such in the next
version.

Thanks for reviewing!

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-08-10 08:30:45 Re: Fix a typo in pgstatfuncs.c
Previous Message Daniel Gustafsson 2022-08-10 08:28:06 Re: createuser doesn't tell default settings for some options