| From: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
|---|---|
| To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
| Cc: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Speed up COPY FROM text/CSV parsing using SIMD |
| Date: | 2025-11-26 11:50:58 |
| Message-ID: | CA+K2Rump8NoMRZRZ2r4jHXUJwByasy_c3_b0oaO+TLkSbMD-jw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello,
On Wed, Nov 19, 2025 at 10:01 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> On Tue, Nov 18, 2025 at 05:20:05PM +0300, Nazir Bilal Yavuz wrote:
> > Thanks, done.
>
> I took a look at the v3 patches. Here are my high-level thoughts:
>
> + /*
> + * Parse data and transfer into line_buf. To get benefit from inlining,
> + * call CopyReadLineText() with the constant boolean variables.
> + */
> + if (cstate->simd_continue)
> + result = CopyReadLineText(cstate, is_csv, true);
> + else
> + result = CopyReadLineText(cstate, is_csv, false);
>
> I'm curious whether this actually generates different code, and if it does,
> if it's actually faster. We're already branching on cstate->simd_continue
> here.
I've compiled both versions with -O2 and confirmed they generate different
code. When simd_continue is passed as a constant to CopyReadLineText, the
compiler optimizes out the condition checks from the SIMD path.
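To illustrate the effect outside the patch, here is a minimal sketch of the same pattern. All names in it (DemoState, demo_find_newline, demo_read_line) are hypothetical stand-ins, not the patch's code, and the "fast path" is a plain 8-byte chunked scan rather than real SIMD. Because the boolean is a literal at each call site, a compiler that inlines the helper can emit two specialized copies and drop the `if (use_fast_path)` test from each; whether it actually does so depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the COPY state; only the flag matters here. */
typedef struct DemoState
{
	bool		simd_continue;
} DemoState;

/*
 * Hypothetical stand-in for CopyReadLineText(): returns the offset of the
 * first '\n' in buf, or len if there is none.  use_fast_path selects a
 * chunked scan (a placeholder for a SIMD loop) ahead of the scalar loop.
 */
static inline size_t
demo_find_newline(const char *buf, size_t len, bool use_fast_path)
{
	size_t		i = 0;

	assert(buf != NULL);

	if (use_fast_path)
	{
		/* Skip whole 8-byte chunks that contain no newline. */
		while (i + 8 <= len && memchr(buf + i, '\n', 8) == NULL)
			i += 8;
	}

	/* Scalar loop: finishes the chunk that has the newline, or the tail. */
	while (i < len && buf[i] != '\n')
		i++;

	return i;
}

/*
 * The caller branches once on the runtime flag and passes a literal in
 * each arm, so each inlined copy sees use_fast_path as a constant.
 */
static size_t
demo_read_line(const DemoState *state, const char *buf, size_t len)
{
	if (state->simd_continue)
		return demo_find_newline(buf, len, true);
	else
		return demo_find_newline(buf, len, false);
}
```

Both arms return the same offset for the same input; only the generated code differs. Note that `static inline` is just a hint, so a build that declines to inline the helper would fall back to a single copy with a runtime check.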
A small benchmark on a 1GB+ file shows the expected benefit: a performance
improvement of around 6%.
I've attached the assembly outputs in case someone wants to check something
else.
Regards,
Ayoub Kazar
| Attachment | Content-Type | Size |
|---|---|---|
| copyfromparse-constant.asm | application/octet-stream | 48.0 KB |
| copyfromparse-variable.asm | application/octet-stream | 47.1 KB |