Re: Make COPY format extendable: Extract COPY TO format implementations

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Sutou Kouhei <kou(at)clear-code(dot)com>, andres(at)anarazel(dot)de
Cc: michael(at)paquier(dot)xyz, david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, zhjwpku(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2025-11-17 17:04:46
Message-ID: c36d218a-bb38-42b9-9076-cb75b8984a39@vondra.me

On 11/14/25 21:19, Masahiko Sawada wrote:
> After offline discussions with Sutou-san, we believe the current APIs
> work well, particularly for text-based formats, though we still need
> to verify there are no performance regressions.

I got pinged about this patch off-list. I won't have capacity to do a
proper review anytime soon, but I had a bit of time to run a simple
benchmark, which seems useful since performance was one of the concerns
raised in this thread.

Attached is a script that does COPY TO/FROM with the built-in formats,
on tables with 1, 10 and 100 integer columns. The data sets range from
10 to 1M rows. The tables are UNLOGGED, to eliminate WAL overhead.
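
For readers without the attachment, here is a rough sketch of the kind of
test matrix described above. It is not the attached copy.sh; it only
generates the SQL for each case. The table name, the /tmp paths, and the
exact row-count steps (powers of ten between 10 and 1M) are assumptions
made for illustration.

```python
from itertools import product

FORMATS = ("text", "csv", "binary")        # built-in COPY formats
COLUMN_COUNTS = (1, 10, 100)               # as described in the message
# Assumption: the message only says "between 10 and 1M rows".
ROW_COUNTS = (10, 100, 1_000, 10_000, 100_000, 1_000_000)
TABLE = "copy_bench"                       # hypothetical table name


def create_table_sql(ncols):
    """UNLOGGED table with ncols integer columns (no WAL overhead)."""
    cols = ", ".join(f"c{i} int" for i in range(1, ncols + 1))
    return f"CREATE UNLOGGED TABLE {TABLE} ({cols});"


def fill_table_sql(ncols, nrows):
    """Populate the table with nrows rows of integers."""
    exprs = ", ".join(["i"] * ncols)
    return (f"INSERT INTO {TABLE} SELECT {exprs} "
            f"FROM generate_series(1, {nrows}) AS s(i);")


def copy_sql(direction, fmt, path):
    """COPY TO or COPY FROM a server-side file in the given format."""
    if direction not in ("TO", "FROM"):
        raise ValueError(f"unexpected direction: {direction}")
    return f"COPY {TABLE} {direction} '{path}' WITH (FORMAT {fmt});"


def benchmark_cases():
    """Yield the SQL statements for every (columns, rows, format) case."""
    for ncols, nrows, fmt in product(COLUMN_COUNTS, ROW_COUNTS, FORMATS):
        path = f"/tmp/{TABLE}_{ncols}_{nrows}.{fmt}"   # hypothetical path
        yield [
            f"DROP TABLE IF EXISTS {TABLE};",
            create_table_sql(ncols),
            fill_table_sql(ncols, nrows),
            copy_sql("TO", fmt, path),      # statement to time
            f"TRUNCATE {TABLE};",
            copy_sql("FROM", fmt, path),    # statement to time
        ]
```

Each list could be fed to psql with \timing enabled, once per build.
Server-side COPY to a file needs superuser or the pg_write_server_files
and pg_read_server_files roles.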

The attached PDF summarizes results from my Ryzen machine, for master
and the patched build. The final columns compare the timings (i.e.
copy/master, the patched timing as a percentage of master). Values
>100% are regressions (marked in red).
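
To make the comparison columns concrete, a minimal sketch of how such a
percentage could be computed. The function names and the use of
milliseconds are assumptions; the PDF may derive its numbers differently
(e.g. from averages over several runs).

```python
def relative_timing(master_ms, patched_ms):
    """Patched timing as a percentage of the master timing."""
    if master_ms <= 0:
        raise ValueError("master timing must be positive")
    return 100.0 * patched_ms / master_ms


def is_regression(master_ms, patched_ms):
    """True when the patched build is slower, i.e. the ratio exceeds 100%."""
    return relative_timing(master_ms, patched_ms) > 100.0
```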

It seems quite "red", but it's not particularly conclusive. The
differences are mostly within 5%, and that could be caused e.g. by
changes to binary layout. And some of the cases got faster too.

It might be interesting to get results from other machines. The script
may need some adjustments, but it shouldn't be too difficult.

The other thing Andres was concerned about is "the amount of COPY
performance improvements it forecloses". I have no opinion on that, as
it depends on what improvements Andres envisioned.

regards

--
Tomas Vondra

Attachment  Content-Type               Size
copy.sh     application/x-shellscript  1.9 KB
copy.pdf    application/pdf            46.4 KB
