Re: Speed up COPY TO text/CSV parsing using SIMD

From: Andres Freund <andres(at)anarazel(dot)de>
To: KAZAR Ayoub <ma_kazar(at)esi(dot)dz>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Mark Wong <markwkm(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD
Date: 2026-02-12 21:25:49
Message-ID: aY5C6Xa5im72NF_Y@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2026-02-12 22:07:52 +0100, KAZAR Ayoub wrote:
> Currently optimizing COPY FROM using SIMD is still under review, but for
> the case of COPY TO using the same ideas, we found that the problem is
> trivial, the attached patch gives very nice speedups as confirmed by
> Manni's benchmarks.

I have a hard time believing that adding a strlen() to the handling of a short
column won't be a measurable overhead with lots of short attributes.
Particularly because the patch afaict will call it repeatedly if there are any
to-be-escaped characters.
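
A toy model of that concern, not the patch's actual code: the function name
and the escaping rule (backslash or delimiter only) are assumptions made for
illustration. It counts how many bytes strlen() walks over when the remaining
length is recomputed after every character that needs escaping.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration only: total bytes traversed by strlen() when
 * the remaining length is recomputed after each to-be-escaped character.
 */
static size_t
bytes_scanned_with_repeated_strlen(const char *s, char delim)
{
    size_t      scanned = 0;
    const char *p = s;

    while (*p != '\0')
    {
        size_t      len = strlen(p);    /* rescans the rest of the value */
        size_t      i = 0;

        scanned += len;

        /* find the next character that would need escaping */
        while (i < len && p[i] != delim && p[i] != '\\')
            i++;

        if (i == len)
            break;              /* nothing left to escape */

        p += i + 1;             /* continue after the escaped character */
    }

    return scanned;
}
```

Under this model a value with no special characters costs one pass, while
every escaped character adds another pass over the remaining tail, on top of
the escaping work itself.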

I also don't think it's good how much code this repeats. I think you'd have to
start with a preparatory patch moving the existing code into static inline
helper functions and then introduce SIMD into those.
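
A minimal sketch of the shape such a refactoring could take. The names
needs_escape() and scan_plain_run() are hypothetical and the escaping rule is
simplified; this is not the existing COPY TO code. The scalar loop sits behind
one static inline helper, so a later SIMD variant would only need to replace
the helper's body.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified rule: backslash, the delimiter, newline and
 * carriage return need escaping.  The real text-format rules cover more.
 */
static inline bool
needs_escape(char c, char delim)
{
    return c == '\\' || c == delim || c == '\n' || c == '\r';
}

/*
 * Hypothetical helper: return the offset of the first character that needs
 * escaping, or the offset of the terminating NUL if there is none.  It
 * stops at the NUL itself, so no separate strlen() call is required.
 */
static inline size_t
scan_plain_run(const char *s, char delim)
{
    size_t      i = 0;

    while (s[i] != '\0' && !needs_escape(s[i], delim))
        i++;

    return i;
}
```

Callers would copy the plain run of scan_plain_run() bytes in one go, emit the
escape for the character at that offset if it is not the NUL, and repeat from
the following byte.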

Greetings,

Andres Freund
