Re: postgres_fdw: Use COPY to speed up batch inserts

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: postgres_fdw: Use COPY to speed up batch inserts
Date: 2025-10-17 09:28:44
Message-ID: CAKZiRmyMRYDMUdssdeBp3WRM0qB37jEeCd-0kSOxtemMLCAOiA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 16, 2025 at 10:42 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> Thanks for the patch. Please add it to the next committfest (PG19-3) at

Hi Matheus! Same here - thanks for the patch!

> > The attached patch uses the COPY command whenever we have a *numSlots >
> > 1 but the tests show that maybe we should have a GUC to enable this?
> >
>
> I can imagine having a GUC for testing, but it's not strictly necessary.

Just a note: I've played with this patch for maybe 20 minutes and it
works. However, if we would like to have yet another GUC, then wouldn't
we need to enable two knobs to get the COPY path? (batch_size and this
hypothetical `batch_use_copy`?)
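
Just to illustrate the question (batch_size is an existing postgres_fdw
option, while batch_use_copy is purely a hypothetical name I made up and
"loopback" is just an assumed server name):

    ALTER SERVER loopback OPTIONS (ADD batch_size '100');
    -- hypothetical second knob, name made up for illustration only:
    ALTER SERVER loopback OPTIONS (ADD batch_use_copy 'true');

i.e. both options would have to be set before the COPY path kicks in.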

Some other stuff I've tried to cover:
1. how this works with INSERT ... RETURNING -> as per the patch it
falls back from COPY to INSERT as expected
2. how this works with INSERT ... ON CONFLICT -> well, we cannot have
constraints on foreign tables, so it is not possible
3. how this works with MERGE -> well, MERGE doesn't work with postgres_fdw
4. I've found that big rows don't play well with the COPY path, which
has no memory limit on the buffered data, so probably some special
handling should be done here (a reproduction sketch follows below,
after the log); it's a nonsense case, but:

postgres(at)postgres:1236 : 15836 # INSERT INTO local_t1 (id, t)
SELECT s, repeat(md5(s::text), 10000000) from generate_series(100,
103) s;
2025-10-17 11:17:08.742 CEST [15836] LOG: statement: INSERT INTO
local_t1 (id, t) SELECT s, repeat(md5(s::text), 10000000) from
generate_series(100, 103) s;
2025-10-17 11:17:08.743 CEST [15838] LOG: statement: START
TRANSACTION ISOLATION LEVEL REPEATABLE READ
2025-10-17 11:17:38.302 CEST [15838] LOG: statement: COPY
public.t1(id, t, counter) FROM STDIN (FORMAT TEXT, DELIMITER ',')
ERROR: string buffer exceeds maximum allowed length (1073741823 bytes)
DETAIL: Cannot enlarge string buffer containing 960000028 bytes
by 320000000 more bytes.
2025-10-17 11:17:40.213 CEST [15836] ERROR: string buffer exceeds
maximum allowed length (1073741823 bytes)
2025-10-17 11:17:40.213 CEST [15836] DETAIL: Cannot enlarge
string buffer containing 960000028 bytes by 320000000 more bytes.
2025-10-17 11:17:40.213 CEST [15836] STATEMENT: INSERT INTO
local_t1 (id, t) SELECT s, repeat(md5(s::text), 10000000) from
generate_series(100, 103) s;

but then that backend never wants to finish (it seems to loop
constantly in PQCleanup() or somewhere close to that), and the server
behaves unstably. Without batch_size set the very same INSERT behaves OK.
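
In case somebody wants to reproduce it, a minimal loopback setup along
these lines should do (just a sketch; the server name and the remote
table definition are assumptions inferred from the COPY statement in
the log above):

    CREATE EXTENSION IF NOT EXISTS postgres_fdw;
    CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (dbname 'postgres');
    CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
    -- remote table, matching the COPY statement above
    CREATE TABLE public.t1 (id int, t text, counter int);
    -- foreign table with batching enabled
    CREATE FOREIGN TABLE local_t1 (id int, t text, counter int)
        SERVER loopback OPTIONS (table_name 't1', batch_size '100');
    -- each repeat() is ~320 MB, so batching a few rows into a single
    -- COPY buffer blows past the 1 GB StringInfo limit
    INSERT INTO local_t1 (id, t)
        SELECT s, repeat(md5(s::text), 10000000)
        FROM generate_series(100, 103) s;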

Regards,
-J.
