Re: Speed up COPY TO text/CSV parsing using SIMD

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: KAZAR Ayoub <ma_kazar(at)esi(dot)dz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Mark Wong <markwkm(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: Speed up COPY TO text/CSV parsing using SIMD
Date: 2026-03-26 21:09:23
Message-ID: acWgg0cG7MALI2hB@nathan
Lists: pgsql-hackers

On Wed, Mar 18, 2026 at 12:02:28AM +0100, KAZAR Ayoub wrote:
> Test                  Master   v3       v3_var   v3_var_noinl
> TEXT clean            1504ms   -24.1%   -23.0%   -21.5%
> CSV clean             1760ms   -34.9%   -32.7%   -33.0%

Nice!

> TEXT 1/3 backslashes  3763ms   +4.6%    +6.9%    +4.1%
> CSV 1/3 quotes        3885ms   +3.1%    +2.7%    -0.8%

Hm. These seem a little bit beyond what we could ignore as noise.

> Wide table TEXT (integer columns):
>
> Cols   Master   v3      v3_var   v3_var_noinl
> 50     2083ms   -0.7%   -0.6%    +3.5%
> 100    4094ms   -0.1%   -0.5%    +4.5%
> 200    1560ms   +0.6%   -2.3%    +3.2%
> 500    1905ms   -1.0%   -1.3%    +4.7%
> 1000   1455ms   +1.8%   +0.4%    +4.3%

These numbers look roughly within the noise range.

> Wide table CSV:
>
> Cols   Master   v3      v3_var   v3_var_noinl
> 50     2421ms   +4.0%   +6.7%    +5.8%

Hm. Is this reproducible? A 4% regression is a bit worrisome.

> 100    4980ms   +0.1%   +2.0%    +0.1%
> 200    1901ms   +1.4%   +3.5%    +1.4%
> 500    2328ms   +1.8%   +2.7%    +2.2%
> 1000   1815ms   +2.0%   +2.8%    +2.5%

These numbers don't bother me too much, but maybe there are some ways to
minimize the regressions further.

--
nathan
