Re: postgres_fdw: Use COPY to speed up batch inserts

From: "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com>
To: "Masahiko Sawada" <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "jian he" <jian(dot)universality(at)gmail(dot)com>, "Tomas Vondra" <tomas(at)vondra(dot)me>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: postgres_fdw: Use COPY to speed up batch inserts
Date: 2026-04-01 15:50:14
Message-ID: DHHXRI5DG48O.E94UR7AEJXS9@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon Mar 30, 2026 at 4:14 PM -03, Masahiko Sawada wrote:
>> Please see the new attached version.
>
> I've reviewed the v12 patch, and here are review comments:
>
> The COPY data sent via postgres_fdw should properly escape the input
> data. The bug I found can be reproduced with the following scenario:
>
> -- on local server
> create server remote foreign data wrapper postgres_fdw;
> create foreign table t (a int, b text) server remote;
> create user mapping for public server remote;
>
> -- on remote server
> create table t (a int, b text);
>
> -- on local server
> copy t(a, d) from stdin;
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
>>> 1 hello\nworld
>>> \.
> ERROR: invalid input syntax for type integer: "world"
> CONTEXT: COPY t, line 2, column a: "world"
> remote SQL command: COPY public.t(a, d) FROM STDIN (FORMAT TEXT)
>

I think that we need something like CopyAttributeOutText() here.

To fix this I've added appendStringInfoText() which is a similar version
of CopyAttributeOutText() that works with a StringInfo. I did not find
any function that I could reuse here, if such function exists please let
me know.

I'm wondering if we should have this similar function or try to combine
both to avoid duplicated logic, although it looks complicated to me at
first look to combine these both usages.

Another option is to use the BINARY format, but it is less portable
compared to the TEXT format across machines architectures and
PostgreSQL versions [1]. For CSV we can just wrap the string into ' but
I think that we can have a performance issue. What do you think?

I've also quickly benchmarked this change and I've got very similar
execution time, with and without this change.

> ---
> I think the patch misses calling set_transmission_modes() and
> reset_transmission_modes() when converting the tuple slots. A
> problematic scenario is:
>
> -- on local server
> create server remote foreign data wrapper postgres_fdw;
> create foreign table f (a float) server remote;
> create user mapping for public server remote;
>
> -- on remote server
> create table f (a float);
>
> -- on local server
> set extra_float_digits = 0;
> copy test_float from stdin;
> 1.0000000000000002
> \.
> select * from f;
> a
> ----
> 1
> (1 row)
>

IIUC the correct output on this case should still be 1 if we did not
reset extra_float_digits, right?

postgres=# create table xxx(a float);
CREATE TABLE
postgres=# set extra_float_digits = 0;
SET
postgres=# copy xxx from stdin;
1.0000000000000002
\.
COPY 1
postgres=# select * from xxx;
a
---
1
(1 row)

postgres=# reset extra_float_digits;
RESET
postgres=# select * from xxx;
a
--------------------
1.0000000000000002
(1 row)

But you are right that set_transmission_modes() call is missing and
without it we will send "1" for the foreign server and it will lost the
actual value. I've fixed this on the new version.

> ---
> +
> + appendStringInfo(buf, "COPY ");
> + deparseRelation(buf, rel);
> + if (nattrs > 0)
> + appendStringInfo(buf, "(");
>
> appendStringInfoString() or appendStringInfoChar() should be used instead.
>

Fixed.

Thank you for reviewing this patch!

[1] https://www.postgresql.org/docs/current/sql-copy.html#SQL-COPY-FILE-FORMATS

--
Matheus Alcantara
EDB: https://www.enterprisedb.com

Attachment Content-Type Size
v13-0001-postgres_fdw-Use-COPY-as-remote-SQL-when-possibl.patch text/plain 19.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2026-04-01 15:51:19 Re: index prefetching
Previous Message Andres Freund 2026-04-01 15:49:30 Re: AIO / read stream heuristics adjustments for index prefetching