Re: postgres_fdw: Use COPY to speed up batch inserts

From: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: postgres_fdw: Use COPY to speed up batch inserts
Date: 2026-03-04 12:17:21
Message-ID: 72ec1708-0c02-4ae9-b4f5-ee2bac5fd2f3@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the review.

On 03/03/26 16:47, Masahiko Sawada wrote:
> Thank you for updating the patch! Here are some review comments:
>
> + Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
> +
> + if (attr->attgenerated)
> + continue;
> +
> + if (!first)
> + appendStringInfoString(buf, ", ");
> +
> + first = false;
> +
> + appendStringInfoString(buf,
> quote_identifier(NameStr(attr->attname)));
>
> We need to take care of the 'column_name' option here. If it's set we
> should not use attr->attname as it is.
>

Fixed

> --
> + }
> + if (nattrs > 0)
> + appendStringInfoString(buf, ") FROM STDIN");
> + else
> + appendStringInfoString(buf, " FROM STDIN");
> +}
>
> It might be better to explicitly specify the format 'text'.
>

Fixed

> ---
> + if (useCopy)
> + {
> + deparseCopySql(&sql, rel, targetAttrs);
> + values_end_len = 0; /* Keep compiler quiet */
> + }
> + else
> + deparseInsertSql(&sql, rte, resultRelation, rel,
> targetAttrs, doNothing,
> +
> resultRelInfo->ri_WithCheckOptions,
> +
> resultRelInfo->ri_returningList,
> + &retrieved_attrs,
> &values_end_len);
>
> I think we should consider whether it's okay to use the COPY command
> even if resultRelInfo->ri_WithCheckOptions is non-NULL. As far as I
> researched, it's okay as we currently don't support COPY to a view but
> please consider it as well. We might want to explain it too in the
> comment.
>

Good point, fixed.

> How about initializing values_end_len with 0 at its declaration?
>

Fixed

> ---
> +-- test that fdw also use COPY FROM as a remote sql
> +set client_min_messages to 'log';
> +
> +create function insert_or_copy() returns trigger as $$
> +declare query text;
> +begin
> + query := current_query();
> + if query ~* '^COPY' then
> + raise notice 'COPY command';
> + elsif query ~* '^INSERT' then
> + raise notice 'INSERT command';
> + end if;
> +return new;
> +end;
> +$$ language plpgsql;
>
> On second thoughts, it might be okay to output the current_query() as it is.
>

Fixed

> ---
> +copy grem1 from stdin;
> +3
> +\.
>
> I think it would be good to have more tests, for example, checking if
> the COPY command method can work properly with batch_size and
> column_name options.
>

I've added new test cases for these cases. To test the batch_size case
I've added an elog(DEBUG1) because using the trigger with
current_query() log an entry for each row that we send for the foreign
server, with the elog(DEBUG1) we can expect one log message for each
batch operation. Please let me know if there is better ways to do this.

Please see the new attached version.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com

Attachment Content-Type Size
v12-0001-postgres_fdw-Use-COPY-as-remote-SQL-when-possibl.patch text/plain 15.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2026-03-04 12:52:00 Re: pg_waldump: support decoding of WAL inside tarfile
Previous Message Jim Jones 2026-03-04 11:35:17 Re: POC: PLpgSQL FOREACH IN JSON ARRAY