Re: Optimize UUID parse using SIMD

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize UUID parse using SIMD
Date: 2026-06-25 19:30:09
Message-ID: CALj2ACUj8mvpC22c6QtwMapYVX5QeXuQU_9T7C6iqQcUQshqag@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Thu, Jun 25, 2026 at 11:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Since commit ec8719ccbfcd made hex_decode_safe() SIMD-aware, decoding
> a run of hex digits is now fast. The attached patch reuses
> hex_decode_safe() in the UUID input function to speed up parsing.
>
> We accept several textual forms of a UUID[1]. The fast path handles
> the common ones: 32 hex digits, the canonical 8x-4x-4x-4x-12x form
> (where "nx" means n hex digits), and either of those wrapped in
> braces. Otherwise, it falls back to the ordinary scalar UUID parse.
>
> I've benchmarked the parse speed using the following query:
>
> CREATE TEMP TABLE u AS SELECT gen_random_uuid()::text AS t FROM
> generate_series(1, 1000000);
> EXPLAIN (ANALYZE, TIMING OFF) SELECT t::uuid FROM u;
>
> I compared the execution time of the second query, which measures
> uuid_in() alone, with/without SIMD optimization. Here are results (the
> median of 5 runs):
>
> HEAD: 208.879 ms
> Patched: 40.983 ms

Nice!

> The improvements look promising to me. But in a realistic pipeline the
> parse is a small fraction of the work, so end-to-end gains could be
> much smaller.
>
> Feedback is very welcome.

I had a quick look at the patch. It mostly looks good to me. I like
the idea of falling back to the scalar path when an error occurs; it
makes for a neat user experience.
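
To make sure I read the fast path correctly, here is a minimal standalone
sketch of how I understand its shape. It is not the patch's code:
decode_hex_run() is a hypothetical scalar stand-in for hex_decode_safe(),
and uuid_fast_path() is a name I made up. It accepts only the four forms
described above and returns false for anything else, so the caller can
fall back to the scalar parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/*
 * Hypothetical stand-in for a bulk hex decoder such as hex_decode_safe().
 * Decodes len hex digits (len must be even) into len/2 bytes; returns
 * false on the first non-hex character.
 */
static bool
decode_hex_run(const char *src, size_t len, uint8_t *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        char    c = src[i];
        uint8_t v;

        if (c >= '0' && c <= '9')
            v = (uint8_t) (c - '0');
        else if (c >= 'a' && c <= 'f')
            v = (uint8_t) (c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            v = (uint8_t) (c - 'A' + 10);
        else
            return false;

        if (i % 2 == 0)
            dst[i / 2] = (uint8_t) (v << 4);    /* high nibble */
        else
            dst[i / 2] |= v;                    /* low nibble */
    }
    return true;
}

/*
 * Assumed fast path: handles 32 hex digits, the 8x-4x-4x-4x-12x form, and
 * either of those wrapped in braces. Returns false when the input has any
 * other shape or contains a non-hex digit; the caller then falls back.
 */
static bool
uuid_fast_path(const char *src, uint8_t out[UUID_LEN])
{
    size_t  len;
    char    digits[32];

    assert(src != NULL && out != NULL);
    len = strlen(src);

    /* Strip one pair of surrounding braces, if present. */
    if (len >= 2 && src[0] == '{' && src[len - 1] == '}')
    {
        src++;
        len -= 2;
    }

    if (len == 32)
        memcpy(digits, src, 32);
    else if (len == 36 && src[8] == '-' && src[13] == '-' &&
             src[18] == '-' && src[23] == '-')
    {
        /* Copy the five digit groups, skipping the four hyphens. */
        memcpy(digits, src, 8);
        memcpy(digits + 8, src + 9, 4);
        memcpy(digits + 12, src + 14, 4);
        memcpy(digits + 16, src + 19, 4);
        memcpy(digits + 20, src + 24, 12);
    }
    else
        return false;

    return decode_hex_run(digits, 32, out);
}
```

With this shape, an input such as a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
is never decoded by the fast path at all; it simply returns false and the
scalar parser decides whether the input is valid.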

I don't think it's worth adding a test case for this, since I believe
the existing tests already cover this code.

A few comments:

1/
+ * pass the local esctx instead of escontext to hex_decode_safe() to

Instead of using variable names in the comments, let's say something
like: we pass a separate error context to detect errors in the SIMD
path and fall back to the normal path instead of raising ERRORs, for a
better user experience.

2/
+ if (esctx.error_occurred)
+ string_to_uuid_scalar(source, uuid, escontext);

An error on supported platforms seems rare, but when one occurs, I
think it's worth emitting a WARNING or LOG message. The query still
succeeds, and the message in the server log, if noticed, could help
explain the fallback or uncover issues in the SIMD code.
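
Roughly what I have in mind, as a standalone sketch rather than real
backend code: SoftErrorCtx is a hypothetical stand-in for the error
context, fast_parse() and scalar_parse() are simplified placeholders for
the two parsers (they only validate, and the scalar stand-in is looser
than the real UUID rules), and fprintf() stands in for the LOG message.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a soft-error context; only the flag matters. */
typedef struct SoftErrorCtx
{
    bool    error_occurred;
} SoftErrorCtx;

/* Counts emitted fallback messages so the behaviour can be observed. */
static int  fallback_logged = 0;

/* Placeholder fast path: accepts exactly 32 hex digits, nothing else. */
static void
fast_parse(const char *src, SoftErrorCtx *ctx)
{
    if (strlen(src) != 32 ||
        strspn(src, "0123456789abcdefABCDEF") != 32)
        ctx->error_occurred = true;
}

/* Placeholder scalar path: 32 hex digits with hyphens allowed anywhere. */
static void
scalar_parse(const char *src, SoftErrorCtx *ctx)
{
    int     ndigits = 0;

    for (; *src != '\0'; src++)
    {
        if (isxdigit((unsigned char) *src))
            ndigits++;
        else if (*src != '-')
        {
            ctx->error_occurred = true;
            return;
        }
    }
    if (ndigits != 32)
        ctx->error_occurred = true;
}

/*
 * Try the fast path with a local context so its failure is not reported
 * to the caller. On failure, leave a message and retry with the scalar
 * parser, which reports real errors through the caller's context.
 */
static bool
parse_with_fallback(const char *src, SoftErrorCtx *caller_ctx)
{
    SoftErrorCtx local = {false};

    fast_parse(src, &local);
    if (local.error_occurred)
    {
        fprintf(stderr,
                "LOG:  fast UUID parse failed, retrying with scalar parser\n");
        fallback_logged++;
        scalar_parse(src, caller_ctx);
    }
    return !caller_ctx->error_occurred;
}
```

One thing the sketch makes visible: malformed input also takes the
fallback route, so in this form the message is emitted for every invalid
UUID as well, not only for problems in the SIMD code.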

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
