Re: Optimize UUID parse using SIMD

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize UUID parse using SIMD
Date: 2026-06-29 21:54:36
Message-ID: CAD21AoDVnaJ8bGftK0VdGdiG1rZGuLop5MmyfW9Duwu_GJgssw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 25, 2026 at 12:30 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Thu, Jun 25, 2026 at 11:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Since commit ec8719ccbfcd made hex_decode_safe() SIMD-aware, decoding
> > a run of hex digits is now fast. The attached patch reuses
> > hex_decode_safe() in the UUID input function to speed up parsing.
> >
> > We accept several textual forms of a UUID[1]. The fast path handles
> > the common ones: 32 hex digits, the canonical 8x-4x-4x-4x-12x form
> > (where "nx" means n hex digits), and either of those wrapped in
> > braces. Otherwise, it falls back to the ordinary scalar UUID parse.
> >
> > I've benchmarked the parse speed using the following query:
> >
> > CREATE TEMP TABLE u AS SELECT gen_random_uuid()::text AS t FROM
> > generate_series(1, 1000000);
> > EXPLAIN (ANALYZE, TIMING OFF) SELECT t::uuid FROM u;
> >
> > I compared the execution time of the second query, which measures
> > uuid_in() alone, with/without SIMD optimization. Here are results (the
> > median of 5 runs):
> >
> > HEAD: 208.879 ms
> > Patched: 40.983 ms
>
> Nice!
>
> > The improvements look promising to me. But in a realistic pipeline the
> > parse is a small fraction of the work, so end-to-end gains could be
> > much smaller.
> >
> > Feedback is very welcome.
>
> I had a quick look at the patch. It mostly looks good to me. I like
> the idea of falling back to the scalar path when an error occurs - a
> neat user experience.
>
> I think it's not worth adding a test case for this because I believe
> this code gets covered anyway from existing tests.
>
> A few comments:
>
> 1/
> + * pass the local esctx instead of escontext to hex_decode_safe() to
>
> Instead of using variable names in the comments, let's say something
> like: we pass a separate error context to detect errors in the SIMD
> path and fall back to the normal path instead of raising ERRORs, for a
> better user experience.

Agreed.

>
> 2/
> + if (esctx.error_occurred)
> + string_to_uuid_scalar(source, uuid, escontext);
>
> An error on supported platforms seems rare, but when one occurs, I
> think it's worth emitting a WARNING or LOG message. This way the query
> succeeds, but later in the server logs, if noticed, it could provide
> useful reasoning or uncover issues in the SIMD code.

Hmm, I'm concerned that emitting a WARNING or LOG message whenever
parsing uncommon UUID forms might be annoying rather than useful.
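
To illustrate the concern, here is a minimal standalone sketch in C. It
is not the patch: parse_fast(), parse_scalar(), decode_hex_run() and
fallback_count are hypothetical names used only for illustration, and
decode_hex_run() is a plain scalar stand-in for a bulk hex decoder. The
scalar parser is loosely modeled on the documented rule that a hyphen is
optional after any group of four hex digits. The counter marks the spot
where a WARNING or LOG message would be emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical: counts how often a WARNING/LOG would have been emitted. */
static int fallback_count = 0;

static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Stand-in for a bulk hex decoder: ndigits hex chars -> ndigits/2 bytes. */
static bool decode_hex_run(const char *src, size_t ndigits, uint8_t *dst)
{
    for (size_t i = 0; i < ndigits; i += 2)
    {
        int hi = hex_val(src[i]);
        int lo = hex_val(src[i + 1]);

        if (hi < 0 || lo < 0)
            return false;
        dst[i / 2] = (uint8_t) ((hi << 4) | lo);
    }
    return true;
}

/* Fast path: 32 hex digits or 8x-4x-4x-4x-12x, optionally in braces. */
static bool parse_fast(const char *src, uint8_t *out)
{
    size_t len = strlen(src);
    char   digits[32];

    if (len >= 2 && src[0] == '{' && src[len - 1] == '}')
    {
        src++;
        len -= 2;
    }
    if (len == 32)
        memcpy(digits, src, 32);
    else if (len == 36 && src[8] == '-' && src[13] == '-' &&
             src[18] == '-' && src[23] == '-')
    {
        /* Squeeze out the four hyphens so one bulk decode suffices. */
        memcpy(digits, src, 8);
        memcpy(digits + 8, src + 9, 4);
        memcpy(digits + 12, src + 14, 4);
        memcpy(digits + 16, src + 19, 4);
        memcpy(digits + 20, src + 24, 12);
    }
    else
        return false;           /* not a form the fast path handles */
    return decode_hex_run(digits, 32, out);
}

/* Scalar path: also accepts a hyphen after any group of four digits. */
static bool parse_scalar(const char *src, uint8_t *out)
{
    bool braces = (src[0] == '{');

    if (braces)
        src++;
    for (int i = 0; i < UUID_LEN; i++)
    {
        int hi = hex_val(src[0]);
        int lo = (hi < 0) ? -1 : hex_val(src[1]);

        if (hi < 0 || lo < 0)
            return false;
        out[i] = (uint8_t) ((hi << 4) | lo);
        src += 2;
        if (src[0] == '-' && (i % 2) == 1 && i < UUID_LEN - 1)
            src++;
    }
    if (braces && *src++ != '}')
        return false;
    return *src == '\0';
}

static bool parse_uuid(const char *src, uint8_t *out)
{
    assert(src != NULL && out != NULL);
    if (parse_fast(src, out))
        return true;
    fallback_count++;           /* the proposed message would fire here */
    return parse_scalar(src, out);
}
```

In this sketch, a perfectly valid input such as
'a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11' takes the fallback and bumps
the counter, even though nothing went wrong in the fast path. With a
message at that point, a table full of such values would log once per
row.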

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
