| From: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
|---|---|
| To: | Manni Wood <manni(dot)wood(at)enterprisedb(dot)com> |
| Cc: | Mark Wong <markwkm(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Speed up COPY FROM text/CSV parsing using SIMD |
| Date: | 2026-01-22 18:22:56 |
| Message-ID: | CA+K2RumUD+aJ3vuD+05aDWj6geek5DCPYD5peXrRU41QjtORFA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello,
On Tue, Jan 20, 2026 at 9:49 PM Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>
wrote:
> Hello, all I have more benchmarks.
>
> These benchmarks are from a Raspberry Pi 5 that I bought. It has an Arm
> Cortex A76 processor.
>
> (I was so impressed with the stability of the results I got on my
> standalone Intel tower PC that I figured I needed a standalone Arm-based
> machine that was not a laptop and not a VM at a cloud service provider. The
> run-to-run results were indeed more stable, just like with my standalone
> tower PC.)
>
> COPY FROM
>
> master: (852558b9)
>
> text, no special: 9111
> text, 1/3 special: 10302
> csv, no special: 11147
> csv, 1/3 special: 13375
>
> v3
>
> text, no special: 7351 (19.3% speedup)
> text, 1/3 special: 10397 (0.9% regression)
> csv, no special: 7272 (34.7% speedup)
> csv, 1/3 special: 13472 (0.7% regression)
>
> v4.2
>
> text, no special: 7300 (19.6% speedup)
> text, 1/3 special: 10537 (2.3% regression)
> csv, no special: 7260 (34.8% speedup)
> csv, 1/3 special: 13881 (3.8% regression)
>
> COPY TO
>
> master: (852558b9)
>
> text, no special: 2446
> text, 1/3 special: 6988
> csv, no special: 2822
> csv, 1/3 special: 6967
>
> v4 (copy to)
>
> text, no special: 1533 (37.3% speedup)
> text, 1/3 special: 5949 (14.8% speedup)
> csv, no special: 1560 (44.7% speedup)
> csv, 1/3 special: 6006 (13.8% speedup)
>
> I find these results particularly exciting because with the COPY FROM v3
> patch, the worst-case scenarios are just under 1% regression. The v4 COPY
> TO patch is a win across the board.
>
> Note that I ran these benchmarks with everything in RAM disk and using the
> cpupower instructions that Nazir suggested.
>
> So on Arm, the v3 COPY FROM patch is almost all upside, and the v4 COPY TO
> patch is all upside. The same is almost true for Intel, but the CSV COPY
> FROM regression, even from the V3 COPY FROM patch, is about 5%. The v4.2
> COPY FROM patch always performs worse than the v3 COPY FROM patch in
> worst-case scenarios.
>
> Does it seem reasonable to stop performance testing the v4.2 COPY FROM
> patch? Have we collected enough benchmark data to be confident that the v3
> COPY FROM patch is the one we should be moving forward with?
>
With v4.2 on the 1/3-specials benchmark, the sampling step will always
decide not to use SIMD, so that 3%-4% regression is the combined cost of
the small overhead of counting special characters, the 2-4 extra branches,
and their effect on code layout, branch prediction, the pipeline, etc. I
don't think it's more complex than v3, but this is the only explanation I
can think of.
And since v4.2 also assumes that special characters are distributed
uniformly across lines, yes, IMHO v3 is generally better.
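To make the kind of overhead I mean concrete, here is a minimal sketch of a sampling heuristic of this general shape. It is not the v4.2 code: the function names, the sample size and the 10% threshold are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning knobs; not taken from any posted patch. */
#define SAMPLE_BYTES        1024
#define MAX_SPECIAL_PERCENT 10

/* Count the bytes in buf[0..len) that a COPY parser must treat specially. */
static size_t
count_special(const char *buf, size_t len, char delim, char quote, char escape)
{
    size_t n = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
    {
        char c = buf[i];

        if (c == delim || c == quote || c == escape || c == '\n' || c == '\r')
            n++;
    }
    return n;
}

/*
 * Look only at the first SAMPLE_BYTES of the input and report whether it is
 * sparse enough in special characters for a SIMD scan to be worthwhile.
 * The answer is only meaningful if the rest of the input resembles the
 * sample, which is the uniformity assumption.
 */
static bool
sample_prefers_simd(const char *buf, size_t len, char delim, char quote,
                    char escape)
{
    size_t sample = len < SAMPLE_BYTES ? len : SAMPLE_BYTES;
    size_t specials;

    if (sample == 0)
        return false;
    specials = count_special(buf, sample, delim, quote, escape);
    return specials * 100 <= sample * MAX_SPECIAL_PERCENT;
}
```

With a threshold like this, input where about one byte in three is special always falls back to the scalar path, so the counting loop and the extra branches are pure overhead there.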
Regards,
Ayoub