| From: | Manni Wood <manni(dot)wood(at)enterprisedb(dot)com> |
|---|---|
| To: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
| Cc: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Mark Wong <markwkm(at)gmail(dot)com>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Speed up COPY FROM text/CSV parsing using SIMD |
| Date: | 2026-02-04 15:07:07 |
| Message-ID: | CAKWEB6r1a3yacHx8bAM3qfpUps4=rm+uUC7JxBfx3P5J0r_SdA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Feb 4, 2026 at 8:29 AM KAZAR Ayoub <ma_kazar(at)esi(dot)dz> wrote:
> Hello,
>
> On Wed, Feb 4, 2026, 6:38 AM Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>
> wrote:
>
>> The 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch seems
>> nice! On my x86 PC, it had the usual performance improvement of earlier
>> patches, but the regression seemed more similar for both text and CSV
>> inputs. Unfortunately, the regression is about 2.5%, but maybe that is an
>> acceptable worst-case for an improvement of 22% for text inputs and 33% for
>> CSV inputs?
>>
>> The 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch looks even
>> better on my Raspberry Pi's arm processor: not only do we see a 22%
>> improvement for text and an almost 34% improvement for CSV, even the
>> worst-case scenarios show an almost 4% improvement for text and an 11.7%
>> improvement for CSV.
>>
>> By comparison,
>> the v5.1-0001-Simple-heuristic-for-SIMD-COPY-FROM.patch.patch's worst-case
>> performance is poorer on both architectures.
>>
>> I'd be curious to know if anyone else can reproduce these
>> numbers. 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch seems
>> like a real winner.
>>
> Thanks for the benchmark, Manni. I suppose this uses the same threshold as
> the patch (4096 bytes); have you tried any larger values for the threshold?
> I still expect fewer L1d cache misses and lower execution times the more we
> increase the threshold (relative to the per-core L1d cache size).
> In my previous, not-so-stable numbers, 28KB wasn't too bad.
>
>
> Regards,
> Ayoub
>
Ah, thanks for the prod, Ayoub. You are correct: the results in my previous
e-mail for 0001-COPY-from-SIMD-v3-with-line_buf-periodic-refill.patch
were measured with LINE_BUF_FLUSH_AFTER set to its default of 4096. I will
try to measure what happens with larger LINE_BUF_FLUSH_AFTER values,
hopefully some time this week.
Best,
-Manni
--
-- Manni Wood EDB: https://www.enterprisedb.com