Re: postgres_fdw: Use COPY to speed up batch inserts

From: "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com>
To: "jian he" <jian(dot)universality(at)gmail(dot)com>, "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com>
Cc: "Tomas Vondra" <tomas(at)vondra(dot)me>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw: Use COPY to speed up batch inserts
Date: 2025-10-24 18:27:01
Message-ID: DDQRIXD8K0Z4.YO9AP5L8F4TI@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, thanks for testing this patch!

On Thu Oct 23, 2025 at 6:49 AM -03, jian he wrote:
> On Thu, Oct 23, 2025 at 8:01 AM Matheus Alcantara
> <matheusssilv97(at)gmail(dot)com> wrote:
>>
>> Please see the attached v3 version that implements this idea.
>>
> hi.
>
> I am not famailith with this module.
> some of the foreach can be replaced with foreach_int.
>
Fixed.

> I suspect that somewhere Form_pg_attribute.attisdropped is not handled properly.
> the following setup will crash.
>
> ---source database
> drop table batch_table1;
> create table batch_table1(x int);
>
> ---foreign table database
> drop foreign table if exists ftable1;
> CREATE FOREIGN TABLE ftable1 ( x int ) SERVER loopback1 OPTIONS (
> table_name 'batch_table1', batch_size '10' );
> ALTER FOREIGN TABLE ftable1 DROP COLUMN x;
> ALTER FOREIGN TABLE ftable1 add COLUMN x int;
>
> INSERT INTO ftable SELECT * FROM generate_series(1, 10) i; --- this
> will cause server crash.
>
I've tested this scenario and the field attisdropped is still being set
to false. After some debugging I realize that the problem was how I was
accessing the fmstate->p_flinfo array - I was using the attum-1 but I
don't think that it's correct.

On create_foreign_modify() we have the following code:

fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
fmstate->p_nums = 0;
if (operation == CMD_INSERT || operation == CMD_UPDATE)
{
/* Set up for remaining transmittable parameters */
foreach(lc, fmstate->target_attrs)
{
int attnum = lfirst_int(lc);
Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);

Assert(!attr->attisdropped);

/* Ignore generated columns; they are set to DEFAULT */
if (attr->attgenerated)
continue;
getTypeOutputInfo(attr->atttypid, &typefnoid, &isvarlena);
fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
fmstate->p_nums++;
}
}

So I think that I should access fmstate->p_flinfo array when looping
through the target_attrs using an int value starting at 0 and ++ after
each iteration. Although I'm not sure if my understanding is fully
correct I've implemented this on the attached patch and it seems to fix
the error.

On this new version I also added some regress tests on postgres_fdw.sql

--
Matheus Alcantara

Attachment Content-Type Size
v4-0001-postgres_fdw-Use-COPY-to-speed-up-batch-inserts.patch text/plain 16.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-10-24 18:35:41 Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array
Previous Message Masahiko Sawada 2025-10-24 18:23:23 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions