Re: Speed up COPY FROM text/CSV parsing using SIMD

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Manni Wood <manni(dot)wood(at)enterprisedb(dot)com>, KAZAR Ayoub <ma_kazar(at)esi(dot)dz>, Neil Conway <neil(dot)conway(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Date: 2026-03-11 20:42:38
Message-ID: abHTvkeIK37hj9oS@nathan
Lists: pgsql-hackers

On Wed, Mar 11, 2026 at 10:22:18PM +0300, Nazir Bilal Yavuz wrote:
> Here is v14 which is v13-0001 + v13-0002.

Thanks! It's getting close.

> + /*
> + * Temporary variables are used here instead of passing the actual
> + * variables (especially input_buf_ptr) directly to the helper. Taking
> + * the address of a local variable might force the compiler to
> + * allocate it on the stack rather than in a register. Because
> + * input_buf_ptr is used heavily in the hot scalar path below, keeping
> + * it in a register is important for performance.
> + */
> + int temp_input_buf_ptr;
> + bool temp_hit_eof = hit_eof;

A few notes:

* Does using a temporary variable for hit_eof actually make a difference?
AFAICT that's only updated when loading more data.

* Does inlining the function produce the same results?

* Also, I'm curious what the usual benchmarks look like with and without
this hack for the latest patch.
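
For readers following along, the pattern described in the quoted comment can be sketched roughly as below. This is only an illustration, not the patch's code: skip_plain_bytes() and find_line_end() are hypothetical names, and the scalar loop stands in for the real SIMD helper. The point is that only the temporaries have their address taken, so the compiler remains free to keep input_buf_ptr in a register.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the SIMD helper: advances *pos past bytes that
 * need no special handling and sets *hit_eof when the buffer is exhausted.
 */
static void
skip_plain_bytes(const char *buf, int len, int *pos, bool *hit_eof)
{
    int         i = *pos;

    while (i < len && buf[i] != '\n' && buf[i] != '\\')
        i++;

    *pos = i;
    if (i >= len)
        *hit_eof = true;
}

/*
 * Hypothetical caller: returns the index of the first unescaped newline,
 * or len if there is none.
 */
int
find_line_end(const char *buf, int len)
{
    int         input_buf_ptr = 0;  /* hot variable; its address is never taken */
    bool        hit_eof = false;

    while (input_buf_ptr < len)
    {
        /* Only the temporaries are passed by address to the helper. */
        int         temp_input_buf_ptr = input_buf_ptr;
        bool        temp_hit_eof = hit_eof;

        skip_plain_bytes(buf, len, &temp_input_buf_ptr, &temp_hit_eof);
        input_buf_ptr = temp_input_buf_ptr;
        hit_eof = temp_hit_eof;

        if (hit_eof)
            break;

        /* Scalar path: the helper stopped on a newline or a backslash. */
        if (buf[input_buf_ptr] == '\n')
            break;

        /* Backslash: skip it together with the byte it escapes. */
        input_buf_ptr += 2;
    }

    /* A trailing backslash can step past the end; clamp to len. */
    if (input_buf_ptr > len)
        input_buf_ptr = len;

    return input_buf_ptr;
}
```

Whether the temporaries help in practice depends on the compiler. If the helper is inlined, the address never escapes and the copies may be unnecessary, which is what the second question above is getting at. Only benchmarks or the generated assembly can settle that.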

--
nathan
