| From: | KAZAR Ayoub <ma_kazar(at)esi(dot)dz> |
|---|---|
| To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
| Cc: | Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Speed up COPY FROM text/CSV parsing using SIMD |
| Date: | 2025-11-18 20:42:39 |
| Message-ID: | CA+K2RunaLe7Wi1jSMXnNLgL5Bn17==PgrGEAG53HwXDuWpXdXg@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Nov 17, 2025, 11:16 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
wrote:
> (assuming there is a desire to
> continue with it)?
I'm hoping to start spending more time on it soon.
>
Some things worth noting for future reference (so that nobody else wastes
time on them): I previously tried several extra micro-optimizations
inside and around CopyReadLineText:
SIMD alignment: Forcing 16-byte-aligned buffers so that we could use aligned
load instructions (_mm_load_si128 vs. _mm_loadu_si128) provided no
measurable benefit on modern CPUs (there's probably a thread somewhere
discussing this that I haven't come across yet). This likely explains why
simd.h exclusively uses unaligned load intrinsics: the performance
difference has been negligible since Nehalem processors.
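For illustration, here is a minimal sketch of the two load variants being
compared. It is not code from the patch: find_newline and its use_aligned
flag are hypothetical names, it assumes GCC or Clang (for __builtin_ctz),
and it falls back to a scalar loop when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper: return the index of the first '\n' in buf[0..len),
 * or len if there is none.  If use_aligned is nonzero, buf must be
 * 16-byte aligned and the aligned load intrinsic is used.
 */
static size_t
find_newline(const char *buf, size_t len, int use_aligned)
{
    size_t      i = 0;

#if defined(__SSE2__)
    const __m128i nl = _mm_set1_epi8('\n');

    if (use_aligned)
        assert(((uintptr_t) buf % 16) == 0);

    for (; i + 16 <= len; i += 16)
    {
        const __m128i *p = (const __m128i *) (buf + i);
        /* Aligned load faults on a misaligned address; unaligned never does. */
        __m128i     chunk = use_aligned ? _mm_load_si128(p)
                                        : _mm_loadu_si128(p);
        int         mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, nl));

        if (mask != 0)
            return i + (size_t) __builtin_ctz((unsigned int) mask);
    }
#else
    (void) use_aligned;
#endif

    /* Scalar tail (and full fallback without SSE2). */
    for (; i < len; i++)
    {
        if (buf[i] == '\n')
            break;
    }
    return i;
}
```

Timing both settings of use_aligned over the same 16-byte-aligned buffer is
one way to check whether the aligned variant buys anything on a given CPU.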
Memory prefetching: Explicit prefetch instructions for the COPY buffer
pipeline (copy_raw_buf, input buffers, etc.) showed either no improvement
or a slight regression. Multiple chunks already fit within a cache line,
the other buffers are too far away to prefetch, and the next part of the
buffer is easily prefetched anyway (nothing special about the access
pattern), so the extra uops turn out not to be worth it.
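As a rough sketch of what "explicit prefetch" means here (again not the
patch code): the loop below issues a software prefetch a fixed distance
ahead while scanning. The function name and PREFETCH_DISTANCE are
hypothetical, and __builtin_prefetch is a GCC/Clang builtin.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how far ahead of the scan to prefetch. */
#define PREFETCH_DISTANCE 256

/* Count '\n' bytes in buf[0..len), prefetching ahead once per 64 bytes. */
static size_t
count_newlines_prefetch(const char *buf, size_t len)
{
    size_t      count = 0;

    for (size_t i = 0; i < len; i++)
    {
        /* One prefetch per 64-byte stride, only while still inside buf. */
        if ((i % 64) == 0 && i + PREFETCH_DISTANCE < len)
            __builtin_prefetch(buf + i + PREFETCH_DISTANCE, 0, 3);

        if (buf[i] == '\n')
            count++;
    }
    return count;
}
```

For a purely sequential scan like this, each prefetch is an extra
instruction on data that is usually on its way into cache already.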
Instruction-level parallelism: Spreading the work across too many
independent vector operations to increase ILP eventually degrades
performance, likely due to backend saturation as observed through perf
(most likely execution port and execution unit contention).
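To make the idea concrete, here is a scalar sketch of the technique (the
experiments above used vector operations; this is only an analogy, and the
function name is hypothetical). Four independent accumulators break the
dependency chain so the CPU can overlap the additions; adding more
accumulators only helps until the execution ports are saturated.

```c
#include <stddef.h>

/* Count '\n' bytes in buf[0..len) using four independent accumulators. */
static size_t
count_newlines_unrolled(const char *buf, size_t len)
{
    size_t      c0 = 0, c1 = 0, c2 = 0, c3 = 0;
    size_t      i = 0;

    /* Each accumulator depends only on itself, so the four adds can overlap. */
    for (; i + 4 <= len; i += 4)
    {
        c0 += (buf[i] == '\n');
        c1 += (buf[i + 1] == '\n');
        c2 += (buf[i + 2] == '\n');
        c3 += (buf[i + 3] == '\n');
    }

    /* Handle the remaining 0 to 3 bytes. */
    for (; i < len; i++)
        c0 += (buf[i] == '\n');

    return c0 + c1 + c2 + c3;
}
```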
.....
This simply suggests that further optimization work should focus on the
pipeline as a whole for larger benefits (parallel copy[0], maybe?).
--
Regards,
Ayoub Kazar