Re: POC: postgres_fdw insert batching

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: postgres_fdw insert batching
Date: 2021-01-13 14:43:29
Message-ID: 131a695b-9010-6751-9ea7-3dabe3555f8c@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/13/21 10:15 AM, Amit Langote wrote:
> Hi Tomas, Tsunakawa-san,
>
> Thanks for your work on this.
>
> On Tue, Jan 12, 2021 at 11:06 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> AFAICS the discussions about making this use COPY and/or libpq
>> pipelining (neither of which is committed yet) ended with the conclusion
>> that those changes are somewhat independent, and that it's worth getting
>> this committed in the current form. Barring objections, I'll push this
>> within the next couple days.
>
> I was trying this out today (been meaning to do so for a while) and
> noticed that this fails when there are AFTER ROW triggers on the
> foreign table. Here's an example:
>
> create extension postgres_fdw ;
> create server lb foreign data wrapper postgres_fdw ;
> create user mapping for current_user server lb;
> create table p (a numeric primary key);
> create foreign table fp (a int) server lb options (table_name 'p');
> create function print_row () returns trigger as $$ begin raise notice
> '%', new; return null; end; $$ language plpgsql;
> create trigger after_insert_trig after insert on fp for each row
> execute function print_row();
> insert into fp select generate_series (1, 10);
> <crashes>
>
> Apparently, the new code seems to assume that batching wouldn't be
> active when the original query contains RETURNING clause but some
> parts fail to account for the case where RETURNING is added to the
> query to retrieve the tuple to pass to the AFTER TRIGGER.
> Specifically, the Assert in the following block in
> execute_foreign_modify() is problematic:
>
> /* Check number of rows affected, and fetch RETURNING tuple if any */
> if (fmstate->has_returning)
> {
> Assert(*numSlots == 1);
> n_rows = PQntuples(res);
> if (n_rows > 0)
> store_returning_result(fmstate, slots[0], res);
> }
>

Thanks for the report. Yeah, I think there's a missing check in
ExecInsert. Adding

(!resultRelInfo->ri_TrigDesc->trig_insert_after_row)

solves this. But now I'm wondering if this is the wrong place to make
this decision. I mean, why should we make the decision here, when the
decision whether to have a RETURNING clause is made in postgres_fdw in
deparseReturningList? We don't really know what the other FDWs will do,
for example.

So I think we should just move all of this into GetModifyBatchSize. We
can start with ri_BatchSize = 0. And then do

if (resultRelInfo->ri_BatchSize == 0)
resultRelInfo->ri_BatchSize =
resultRelInfo->ri_FdwRoutine->GetModifyBatchSize(resultRelInfo);

if (resultRelInfo->ri_BatchSize > 1)
{
... do batching ...
}

The GetModifyBatchSize would always return value > 0, so either 1 (no
batching) or >1 (batching).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Add-bulk-insert-for-foreign-tables-v7.patch text/x-patch 48.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sehrope Sarkuni 2021-01-13 15:00:49 Re: Moving other hex functions to /common
Previous Message Fujii Masao 2021-01-13 14:28:55 Re: outdated references to replication timeout