| From: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
|---|---|
| To: | Mark Wong <markwkm(at)gmail(dot)com>, Neil Conway <neil(dot)conway(at)gmail(dot)com> |
| Cc: | Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Speed up COPY FROM text/CSV parsing using SIMD |
| Date: | 2026-02-02 20:45:04 |
| Message-ID: | CA+K2RunBvVQGPvxGLBf60YV_6b2goeG3Y9+CS6XVaxaJyWCNGw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello Mark,
On Fri, Jan 30, 2026 at 11:05 PM Mark Wong <markwkm(at)gmail(dot)com> wrote:
> On Tue, Jan 13, 2026 at 06:20:27PM -0600, Manni Wood wrote:
> > On Tue, Jan 13, 2026 at 1:12 PM Mark Wong <markwkm(at)gmail(dot)com> wrote:
> >
> > > On Fri, Jan 09, 2026 at 05:21:45PM +0300, Nazir Bilal Yavuz wrote:
> > > > Were you able to understand why Mark's benchmark results are different
> > > > from ours?
> > >
> > > Not yet... I had some guesses, which is why I suggested the processor pinning
> > > and using a ramdisk. But we haven't tried applying all of those to my laptop,
> > > which has 3 core types, or the POWER system, which may be interesting to use a
> > > ram disk on.
> > >
> > > I'm curious though, and admittedly haven't tried looking myself yet, about how
> > > the SIMD calls might look across different processor architectures. We'll try
> > > to get that on the POWER system soon...
> > >
> > > Regards,
> > > Mark
> >
> > Hello!
> >
> > Nazir, I'm glad you are finding the benchmarks useful. I have more! :-)
> >
> > All of these benchmarks are all-in-RAM, because I do think that is the best
> > way of getting closest to the theoretical best and worst case scenarios.
> >
> > My laptop:
> >
> > master: (852558b9)
> >
> > text, no special: 14996
> > text, 1/3 special: 17270
> > csv, no special: 18274
> > csv, 1/3 special: 23852
> >
> > v3
> >
> > text, no special: 11282 (24.7% speedup)
> > text, 1/3 special: 15748 (8.8% speedup) <-- I don't believe this but it's what I got
> > csv, no special: 11571 (36.6% speedup)
> > csv, 1/3 special: 19934 (16.4% speedup) <-- I don't believe this but it's what I got
> >
> > v4.2
> >
> > text, no special: 11139 (25.7% speedup)
> > text, 1/3 special: 18900 (9.4% regression)
> > csv, no special: 11490 (37.1% speedup)
> > csv, 1/3 special: 26134 (9.5% regression)
> >
> > An AWS EC2 t2.2xlarge instance
> >
> > master: (852558b9)
> >
> > text, no special: 20677
> > text, 1/3 special: 22660
> > csv, no special: 24534
> > csv, 1/3 special: 30999
> >
> > v3
> >
> > text, no special: 17534 (15.2% speedup)
> > text, 1/3 special: 22816 (0.6% regression)
> > csv, no special: 17664 (28.0% speedup)
> > csv, 1/3 special: 29338 (5.3% speedup) <-- I don't believe this but it's what I got
> >
> > v4.2
> >
> > text, no special: 17459 (15.5% speedup)
> > text, 1/3 special: 25051 (10.5% regression)
> > csv, no special: 17574 (28.3% speedup)
> > csv, 1/3 special: 32092 (3.5% regression)
> >
> > An AWS EC2 t4g.2xlarge instance (using ARM processor; first test of ARM processor!)
> >
> > master: (852558b9)
> >
> > text, no special: 22081
> > text, 1/3 special: 25100
> > csv, no special: 27296
> > csv, 1/3 special: 32344
> >
> > v3
> >
> > text, no special: 17724 (19.7% speedup)
> > text, 1/3 special: 27606 (9.9% regression) <-- yikes! We would want to test this more
> > csv, no special: 17597 (35.5% speedup)
> > csv, 1/3 special: 32597 (0.8% regression)
> >
> > v4.2
> >
> > text, no special: 17674 (20% speedup)
> > text, 1/3 special: 25773 (2.6% regression) <-- this regression is less than for the v3 patch? Atypical...
> > csv, no special: 17651 (35.3% speedup)
> > csv, 1/3 special: 34055 (5.3% regression)
>
> I'm still lagging behind a little. I ran the v4.2 patches again, applied to
> 71c11369, on the POWER system that I have access to, using Manni's copysimdperf
> scripts to use a ramdisk and processor pinning.
>
> text, no special: -2508 (30% speedup)
> text, 1/3 special: -1753 (48% speedup)
> csv, no special: 9264 (3% regression)
> csv, 1/3 special: -4077 (0.3% speedup)
>
Thanks for the benchmark. I'm a bit suspicious about these numbers, because I find
it illogical, or at least highly unexpected, for a 1/3 specials workload to
perform better than a no-specials one, or for csv with no specials to regress.
The no-specials case is expected to take the SIMD path for the whole input, so it
should perform better than master (at least...).
I wonder what the results look like for the COPY TO case on POWER. If you can
try it, that case is theoretically even more predictable.
Regards,
Ayoub
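
The sanity check described above can be sketched roughly as follows. This is only an illustration: the function names and the dictionary layout are hypothetical, the raw timings are assumed to be "lower is better" (the thread does not state units), and the POWER figures are entered as the percentages reported in the quoted message, since the raw values there are ambiguous.

```python
from typing import Dict, List


def relative_change(master: float, patched: float) -> float:
    """Fraction of master's time saved by the patch (negative = regression).

    Assumes both values are elapsed times, i.e. lower is better.
    """
    if master <= 0:
        raise ValueError("master timing must be positive")
    return (master - patched) / master


def suspicious(changes: Dict[str, Dict[str, float]]) -> List[str]:
    """Flag results that contradict the expectation stated in the message:
    the no-special case should not regress, and it should gain at least as
    much as the 1/3-special case."""
    flags = []
    for fmt, c in changes.items():
        if c["no special"] < 0:
            flags.append(f"{fmt}: no-special case regresses")
        if c["1/3 special"] > c["no special"]:
            flags.append(f"{fmt}: 1/3-special beats no-special")
    return flags


# Manni's laptop, v4.2 vs master, raw numbers taken from the quoted message.
laptop_v42 = {
    "text": {
        "no special": relative_change(14996, 11139),
        "1/3 special": relative_change(17270, 18900),
    },
    "csv": {
        "no special": relative_change(18274, 11490),
        "1/3 special": relative_change(23852, 26134),
    },
}

# POWER, v4.2: percentages as reported in the quoted message.
power_v42 = {
    "text": {"no special": 0.30, "1/3 special": 0.48},
    "csv": {"no special": -0.03, "1/3 special": 0.003},
}

print(suspicious(laptop_v42))
print(suspicious(power_v42))
```

With these inputs the laptop run raises no flags, while the POWER run is flagged for both text and csv, which matches the two oddities pointed out above.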