Re: Speed up COPY FROM text/CSV parsing using SIMD

From: KAZAR Ayoub <ma_kazar(at)esi(dot)dz>
To: Mark Wong <markwkm(at)gmail(dot)com>, Neil Conway <neil(dot)conway(at)gmail(dot)com>
Cc: Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: 2026-02-02 20:45:04
Message-ID: CA+K2RunBvVQGPvxGLBf60YV_6b2goeG3Y9+CS6XVaxaJyWCNGw@mail.gmail.com
Lists: pgsql-hackers

Hello Mark,

On Fri, Jan 30, 2026 at 11:05 PM Mark Wong <markwkm(at)gmail(dot)com> wrote:

> On Tue, Jan 13, 2026 at 06:20:27PM -0600, Manni Wood wrote:
> > On Tue, Jan 13, 2026 at 1:12 PM Mark Wong <markwkm(at)gmail(dot)com> wrote:
> >
> > > On Fri, Jan 09, 2026 at 05:21:45PM +0300, Nazir Bilal Yavuz wrote:
> > > > Were you able to understand why Mark's benchmark results are different
> > > > from ours?
> > >
> > > Not yet... I had some guesses, which is why I suggested the processor pinning
> > > and using a ramdisk. But we haven't tried applying all of those to my laptop,
> > > which has 3 core types, or the POWER system, which may be interesting to use a
> > > ram disk on.
> > >
> > > I'm curious though, and admittedly haven't tried looking myself yet, about how
> > > the SIMD calls might look across different processor architectures. We'll try
> > > to get that on the POWER system soon...
> > >
> > > Regards,
> > > Mark
> >
> > Hello!
> >
> > Nazir, I'm glad you are finding the benchmarks useful. I have more! :-)
> >
> > All of these benchmarks are all-in-RAM, because I do think that is the best
> > way of getting closest to the theoretical best and worst case scenarios.
> >
> > My laptop:
> >
> > master: (852558b9)
> >
> > text, no special: 14996
> > text, 1/3 special: 17270
> > csv, no special: 18274
> > csv, 1/3 special: 23852
> >
> > v3
> >
> > text, no special: 11282 (24.7% speedup)
> > text, 1/3 special: 15748 (8.8% speedup) <-- I don't believe this but it's what I got
> > csv, no special: 11571 (36.6% speedup)
> > csv, 1/3 special: 19934 (16.4% speedup) <-- I don't believe this but it's what I got
> >
> > v4.2
> >
> > text, no special: 11139 (25.7% speedup)
> > text, 1/3 special: 18900 (9.4% regression)
> > csv, no special: 11490 (37.1% speedup)
> > csv, 1/3 special: 26134 (9.5% regression)
> >
> > An AWS EC2 t2.2xlarge instance
> >
> > master: (852558b9)
> >
> > text, no special: 20677
> > text, 1/3 special: 22660
> > csv, no special: 24534
> > csv, 1/3 special: 30999
> >
> > v3
> >
> > text, no special: 17534 (15.2% speedup)
> > text, 1/3 special: 22816 (0.6% regression)
> > csv, no special: 17664 (28.0% speedup)
> > csv, 1/3 special: 29338 (5.3% speedup) <-- I don't believe this but it's what I got
> >
> > v4.2
> >
> > text, no special: 17459 (15.5% speedup)
> > text, 1/3 special: 25051 (10.5% regression)
> > csv, no special: 17574 (28.3% speedup)
> > csv, 1/3 special: 32092 (3.5% regression)
> >
> > An AWS EC2 t4g.2xlarge instance (using ARM processor; first test of ARM processor!)
> >
> > master: (852558b9)
> >
> > text, no special: 22081
> > text, 1/3 special: 25100
> > csv, no special: 27296
> > csv, 1/3 special: 32344
> >
> > v3
> >
> > text, no special: 17724 (19.7% speedup)
> > text, 1/3 special: 27606 (9.9% regression) <-- yikes! We would want to test this more
> > csv, no special: 17597 (35.5% speedup)
> > csv, 1/3 special: 32597 (0.8% regression)
> >
> > v4.2
> >
> > text, no special: 17674 (20% speedup)
> > text, 1/3 special: 25773 (2.6% regression) <-- this regression is less than for the v3 patch? Atypical...
> > csv, no special: 17651 (35.3% speedup)
> > csv, 1/3 special: 34055 (5.3% regression)
>
> I'm still lagging behind a little. I ran the v4.2 patches again applied to
> 71c11369 on the POWER system that I have access to, using Manni's copysimdperf
> scripts to use a ramdisk and processor pinning.
>
> text, no special: -2508 (30% speedup)
> text, 1/3 special: -1753 (48% speedup)
> csv, no special: 9264 (3% regression)
> csv, 1/3 special: -4077 (0.3% speedup)
>

Thanks for the benchmark. I'm a bit suspicious about these numbers, because I
find it illogical, or at least highly unexpected, for a 1/3 specials workload
to perform better than a no-specials one, or for csv with no specials to
regress. A no-specials input is expected to take the SIMD path for the whole
processing, so it's supposed to perform better than master (at least ...).
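
To make that expectation concrete, here is a minimal sketch of a chunked scan.
It is not the patch's code: first_special() is a hypothetical helper, it
ignores the escape character, and it uses a portable 8-byte SWAR trick in
place of real SIMD instructions. The point is only that input with no special
bytes never leaves the wide loop, while a special byte forces a fallback to
the byte-at-a-time loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero iff at least one byte of v is zero (classic SWAR test). */
static inline uint64_t
has_zero_byte(uint64_t v)
{
    return (v - UINT64_C(0x0101010101010101)) & ~v &
           UINT64_C(0x8080808080808080);
}

/* Nonzero iff at least one byte of v equals c. */
static inline uint64_t
has_byte(uint64_t v, unsigned char c)
{
    /* XOR with c in every byte lane turns matching bytes into zero. */
    return has_zero_byte(v ^ (UINT64_C(0x0101010101010101) * c));
}

/*
 * Hypothetical helper (an assumption for illustration, not from the patch):
 * return the offset of the first special byte in buf[0..len), or len if
 * there is none.  "Special" is simplified here to delim, quote, '\n', '\r'.
 */
static size_t
first_special(const char *buf, size_t len, char delim, char quote)
{
    size_t i = 0;

    /* Wide path: skip whole 8-byte chunks that contain no special byte. */
    while (i + 8 <= len)
    {
        uint64_t chunk;

        memcpy(&chunk, buf + i, 8);   /* avoids unaligned-access issues */
        if (has_byte(chunk, (unsigned char) delim) |
            has_byte(chunk, (unsigned char) quote) |
            has_byte(chunk, '\n') |
            has_byte(chunk, '\r'))
            break;                    /* special byte here: go bytewise */
        i += 8;
    }

    /* Scalar path: locate the exact byte, and handle the short tail. */
    for (; i < len; i++)
    {
        char c = buf[i];

        if (c == delim || c == quote || c == '\n' || c == '\r')
            return i;
    }
    return len;
}
```

For a buffer with no special bytes, first_special() returns len after walking
the input in 8-byte steps plus a short tail, which is why the no-specials
workloads are the expected best case for a chunked approach.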

I wonder what the results look like for the COPY TO case on POWER. If you can
try it, that case is at least more predictable in theory.

Regards,
Ayoub
