Re: Speed up COPY TO text/CSV parsing using SIMD

From:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
To:	KAZAR Ayoub <ma_kazar(at)esi(dot)dz>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Mark Wong <markwkm(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject:	Re: Speed up COPY TO text/CSV parsing using SIMD
Date:	2026-03-31 16:30:54
Message-ID:	acv2vu8miagnHG1B@nathan
Lists:	pgsql-hackers

On Fri, Mar 27, 2026 at 07:48:38PM +0100, KAZAR Ayoub wrote:
> I added a prescan loop inside the SIMD helpers that tries to catch special
> characters within the first sizeof(Vector8) characters. I measured how well
> this reduces the overhead of starting SIMD and exiting at the first vector:
> the scalar loop is better than SIMD for one vector if it finds a special
> character before the 6th character; the worst case is not a clean vector,
> where the scalar loop needs 20 more cycles compared to SIMD.
> This helps mitigate the case of JSON(B) in CSV format, which is why I added
> it for the CSV case only.
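
For readers following the thread, the prescan idea described above can be
sketched roughly as below. This is not the code from the patch. The names
find_first_special, is_special and CHUNK_SIZE are hypothetical, the set of
"special" bytes (delimiter, quote, newline, carriage return) is an
assumption, and raw SSE2 intrinsics stand in for the Vector8 helpers. The
point is only the shape: a scalar pass over the first vector-width bytes, so
that inputs with an early special character (such as JSON text, which starts
with a quote almost immediately) never pay the SIMD setup cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Stand-in for sizeof(Vector8); 16 bytes is an assumption. */
#define CHUNK_SIZE 16

/* Hypothetical predicate: bytes assumed to need special handling in CSV. */
static inline bool
is_special(unsigned char c, char delim, char quote)
{
	return c == (unsigned char) delim || c == (unsigned char) quote ||
		c == '\n' || c == '\r';
}

/*
 * Return the offset of the first special byte in buf[0..len), or len if
 * there is none.
 */
static size_t
find_first_special(const char *buf, size_t len, char delim, char quote)
{
	size_t		i = 0;
	size_t		prescan_end = (len < CHUNK_SIZE) ? len : CHUNK_SIZE;

	assert(buf != NULL || len == 0);

	/* Scalar prescan: cheap exit when a special byte appears early. */
	for (; i < prescan_end; i++)
	{
		if (is_special((unsigned char) buf[i], delim, quote))
			return i;
	}

#ifdef __SSE2__
	{
		const __m128i vd = _mm_set1_epi8(delim);
		const __m128i vq = _mm_set1_epi8(quote);
		const __m128i vn = _mm_set1_epi8('\n');
		const __m128i vr = _mm_set1_epi8('\r');

		/* Skip whole chunks that contain no special byte. */
		for (; i + CHUNK_SIZE <= len; i += CHUNK_SIZE)
		{
			__m128i		v = _mm_loadu_si128((const __m128i *) (buf + i));
			__m128i		m = _mm_or_si128(
				_mm_or_si128(_mm_cmpeq_epi8(v, vd), _mm_cmpeq_epi8(v, vq)),
				_mm_or_si128(_mm_cmpeq_epi8(v, vn), _mm_cmpeq_epi8(v, vr)));

			if (_mm_movemask_epi8(m) != 0)
				break;			/* hit in this chunk; located below */
		}
	}
#endif

	/* Scalar tail; also pinpoints the byte after a SIMD hit. */
	for (; i < len; i++)
	{
		if (is_special((unsigned char) buf[i], delim, quote))
			return i;
	}
	return len;
}
```

With a comma delimiter and a double-quote quote character, a call such as
find_first_special("{\"a\":1}", 7, ',', '"') returns during the prescan and
never touches the vector path, which is the early-exit case the prescan is
meant to make cheap.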

Interesting.

> In a benchmark with 10M early SIMD exits, like the JSONB case, the previous
> 3% regression is gone.

While these are nice results, I think it's best that we target v20 for this
patch so that we have more time to benchmark and explore edge cases.

--
nathan

In response to

Re: Speed up COPY TO text/CSV parsing using SIMD at 2026-03-27 18:48:38 from KAZAR Ayoub

Responses

Re: Speed up COPY TO text/CSV parsing using SIMD at 2026-04-02 18:07:38 from KAZAR Ayoub
