| From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
|---|---|
| To: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Mark Wong <markwkm(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
| Subject: | Re: Speed up COPY TO text/CSV parsing using SIMD |
| Date: | 2026-03-26 21:09:23 |
| Message-ID: | acWgg0cG7MALI2hB@nathan |
| Lists: | pgsql-hackers |
On Wed, Mar 18, 2026 at 12:02:28AM +0100, KAZAR Ayoub wrote:
> Test                  Master  v3      v3_var  v3_var_noinl
> TEXT clean            1504ms  -24.1%  -23.0%  -21.5%
> CSV clean             1760ms  -34.9%  -32.7%  -33.0%
Nice!
> TEXT 1/3 backslashes  3763ms  +4.6%   +6.9%   +4.1%
> CSV 1/3 quotes        3885ms  +3.1%   +2.7%   -0.8%
Hm. These seem a little bit beyond what we could ignore as noise.
> Wide table TEXT (integer columns):
>
> Cols  Master  v3      v3_var  v3_var_noinl
> 50    2083ms  -0.7%   -0.6%   +3.5%
> 100   4094ms  -0.1%   -0.5%   +4.5%
> 200   1560ms  +0.6%   -2.3%   +3.2%
> 500   1905ms  -1.0%   -1.3%   +4.7%
> 1000  1455ms  +1.8%   +0.4%   +4.3%
These numbers look roughly within the noise range.
> Wide table CSV:
>
> Cols  Master  v3      v3_var  v3_var_noinl
> 50    2421ms  +4.0%   +6.7%   +5.8%
Hm. Is this reproducible? A 4% regression is a bit worrisome.
> 100   4980ms  +0.1%   +2.0%   +0.1%
> 200   1901ms  +1.4%   +3.5%   +1.4%
> 500   2328ms  +1.8%   +2.7%   +2.2%
> 1000  1815ms  +2.0%   +2.8%   +2.5%
These numbers don't bother me too much, but maybe there are some ways to
minimize the regressions further.
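For what it's worth, one rough way to judge whether a delta like the ones
quoted above is reproducible or just noise is to repeat each run several
times and compare the change in medians against the run-to-run spread of the
baseline. The sketch below is only an illustration of that idea: the function
names are hypothetical, the spread-based threshold is an assumption rather
than anything the benchmark in this thread used, and it expects the caller to
supply real repeated timings.

```python
from statistics import median


def percent_change(baseline_ms, patched_ms):
    """Change of the patched median relative to the baseline median, in percent."""
    base = median(baseline_ms)
    return (median(patched_ms) - base) / base * 100.0


def noise_band(baseline_ms):
    """Half the min-to-max spread of the baseline runs, as a percent of their median.

    This is an assumed, deliberately crude noise estimate.
    """
    base = median(baseline_ms)
    return (max(baseline_ms) - min(baseline_ms)) / (2.0 * base) * 100.0


def looks_like_regression(baseline_ms, patched_ms):
    """True when the slowdown is larger than the baseline's own run-to-run spread."""
    return percent_change(baseline_ms, patched_ms) > noise_band(baseline_ms)
```

With only a handful of runs per configuration this is a sanity check, not a
statistical test, but it should at least flag deltas that are smaller than
the variation seen on master alone.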
--
nathan