| From: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
|---|---|
| To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, Mark Wong <markwkm(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
| Subject: | Re: Speed up COPY TO text/CSV parsing using SIMD |
| Date: | 2026-04-02 18:07:38 |
| Message-ID: | CA+K2Ru=JK5NUEaxA77pCEer40QnV1TMxeg68Et9RL0zMZw_Jyw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Mar 31, 2026 at 6:30 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
wrote:
> On Fri, Mar 27, 2026 at 07:48:38PM +0100, KAZAR Ayoub wrote:
> > I added a prescan loop inside the simd helpers trying to catch special
> > chars in sizeof(Vector8) characters. I measured how good this is at
> > reducing the overhead of starting SIMD and exiting at the first vector:
> > the scalar loop is better than SIMD for one vector if it finds a special
> > character before 6th character, worst case is not a clean vector, where the
> > scalar loop needs 20 more cycles compared to SIMD.
> > This helps mitigate the case of JSON(B) in CSV format, which is why I
> > added this for the CSV case only.
>
> Interesting.
>
> > In a benchmark with 10M early SIMD exit like the JSONB case, the previous
> > 3% regression is gone.
>
> While these are nice results, I think it's best that we target v20 for this
> patch so that we have more time to benchmark and explore edge cases.
>
Thanks for the review.
Fair enough, I'll try many more cases in the upcoming weeks to make sure
we're not missing anything.
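
For readers following along, here is a rough standalone sketch of the prescan
idea quoted above. It is not the patch code: CHUNK stands in for
sizeof(Vector8), the names is_special, chunk_has_special and
find_first_special are hypothetical, and the set of "special" characters
(delimiter, quote, newline, carriage return) is an assumption made for
illustration. The first chunk is checked with a plain scalar loop, so rows
that hit a special character almost immediately (as with JSONB in CSV) return
before any vector work starts; only after a clean first chunk does the
chunk-at-a-time scan begin.

```c
#include <stdbool.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define CHUNK 16				/* assumption: stands in for sizeof(Vector8) */

/* Hypothetical set of characters that stop the fast path. */
static inline bool
is_special(char c, char delim, char quote)
{
	return c == delim || c == quote || c == '\n' || c == '\r';
}

/* True if any of the CHUNK bytes at p is special. */
static bool
chunk_has_special(const char *p, char delim, char quote)
{
#if defined(__SSE2__)
	__m128i		v = _mm_loadu_si128((const __m128i *) p);
	__m128i		m1 = _mm_or_si128(_mm_cmpeq_epi8(v, _mm_set1_epi8(delim)),
								  _mm_cmpeq_epi8(v, _mm_set1_epi8(quote)));
	__m128i		m2 = _mm_or_si128(_mm_cmpeq_epi8(v, _mm_set1_epi8('\n')),
								  _mm_cmpeq_epi8(v, _mm_set1_epi8('\r')));

	return _mm_movemask_epi8(_mm_or_si128(m1, m2)) != 0;
#else
	/* Portable fallback when SSE2 is not available. */
	for (int i = 0; i < CHUNK; i++)
		if (is_special(p[i], delim, quote))
			return true;
	return false;
#endif
}

/* Returns the index of the first special character, or len if none. */
static size_t
find_first_special(const char *s, size_t len, char delim, char quote)
{
	size_t		i = 0;
	size_t		prescan_end = len < CHUNK ? len : CHUNK;

	/* Prescan: scalar check of the first chunk, cheap early exit. */
	for (; i < prescan_end; i++)
		if (is_special(s[i], delim, quote))
			return i;

	/* Chunked scan: skip over whole clean chunks. */
	while (i + CHUNK <= len && !chunk_has_special(s + i, delim, quote))
		i += CHUNK;

	/* Scalar tail: also pinpoints the position inside a dirty chunk. */
	for (; i < len; i++)
		if (is_special(s[i], delim, quote))
			return i;

	return len;
}
```

For example, find_first_special("abc,def", 7, ',', '"') returns 3 from the
prescan loop without ever touching the vector path. Whether the prescan pays
off depends on where the first special character tends to fall, which is what
the measurements quoted above are about.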
>
> --
> nathan
Regards,
Ayoub