Re: Speed up COPY FROM text/CSV parsing using SIMD

From: KAZAR Ayoub <ma_kazar(at)esi(dot)dz>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: 2025-11-26 11:50:58
Message-ID: CA+K2Rump8NoMRZRZ2r4jHXUJwByasy_c3_b0oaO+TLkSbMD-jw@mail.gmail.com
Lists: pgsql-hackers

Hello,
On Wed, Nov 19, 2025 at 10:01 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
wrote:

> On Tue, Nov 18, 2025 at 05:20:05PM +0300, Nazir Bilal Yavuz wrote:
> > Thanks, done.
>
> I took a look at the v3 patches. Here are my high-level thoughts:
>
> + /*
> + * Parse data and transfer into line_buf. To get benefit from inlining,
> + * call CopyReadLineText() with the constant boolean variables.
> + */
> + if (cstate->simd_continue)
> + result = CopyReadLineText(cstate, is_csv, true);
> + else
> + result = CopyReadLineText(cstate, is_csv, false);
>
> I'm curious whether this actually generates different code, and if it does,
> if it's actually faster. We're already branching on cstate->simd_continue
> here.

I compiled both versions with -O2 and confirmed that they generate different
code. When simd_continue is passed to CopyReadLineText as a constant, the
compiler removes the condition checks from the SIMD path.
A small benchmark on a 1GB+ file shows the expected benefit: roughly a 6%
performance improvement.
I've attached the assembly output for both versions in case anyone wants to
check anything else.
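
For anyone who wants to see the effect without building the patch, here is a
minimal standalone sketch of the same pattern. It is not the patch code:
scan_newlines() and count_lines() are hypothetical stand-ins for
CopyReadLineText() and its caller, and the 8-byte loop only imitates a SIMD
path. The idea is that the caller branches once on the runtime flag and
passes a literal true/false down, so each inlined copy can have the
"if (fast)" test folded away.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for CopyReadLineText(): counts '\n' bytes.
 * "fast" is intended to be a compile-time constant at every call site,
 * so that after inlining the compiler can drop the branch on it.
 */
static inline size_t
scan_newlines(const char *buf, size_t len, bool fast)
{
	size_t		count = 0;
	size_t		i = 0;

	if (fast)
	{
		/* Imitation of a SIMD path: consume 8 bytes per iteration. */
		for (; i + 8 <= len; i += 8)
			for (size_t j = 0; j < 8; j++)
				count += (buf[i + j] == '\n');
	}

	/* Scalar path; also handles the tail left over by the fast path. */
	for (; i < len; i++)
		count += (buf[i] == '\n');

	return count;
}

/*
 * Hypothetical caller: branch once on the runtime flag, then call the
 * helper with constants so each call can be specialized separately.
 */
size_t
count_lines(const char *buf, size_t len, bool simd_continue)
{
	if (simd_continue)
		return scan_newlines(buf, len, true);
	else
		return scan_newlines(buf, len, false);
}
```

Compiling this with "gcc -O2 -S" and comparing it against a variant that
passes simd_continue straight through should show whether the compiler
specializes the two calls. Plain "static inline" is only a hint, so the
result may differ between compilers and versions.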

Regards,
Ayoub Kazar

Attachment Content-Type Size
copyfromparse-constant.asm application/octet-stream 48.0 KB
copyfromparse-variable.asm application/octet-stream 47.1 KB
