| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Optimize UUID parse using SIMD |
| Date: | 2026-06-29 21:54:36 |
| Message-ID: | CAD21AoDVnaJ8bGftK0VdGdiG1rZGuLop5MmyfW9Duwu_GJgssw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Thu, Jun 25, 2026 at 12:30 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Thu, Jun 25, 2026 at 11:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Since commit ec8719ccbfcd made hex_decode_safe() SIMD-aware, decoding
> > a run of hex digits is now fast. The attached patch reuses
> > hex_decode_safe() in the UUID input function to speed up parsing.
> >
> > We accept several textual forms of a UUID[1]. The fast path handles
> > the common ones: 32 hex digits, the canonical 8x-4x-4x-4x-12x form
> > (where "nx" means n hex digits), and either of those wrapped in
> > braces. Otherwise, it falls back to the ordinary scalar UUID parse.
> >
> > I've benchmarked the parse speed using the following query:
> >
> > CREATE TEMP TABLE u AS SELECT gen_random_uuid()::text AS t FROM
> > generate_series(1, 1000000);
> > EXPLAIN (ANALYZE, TIMING OFF) SELECT t::uuid FROM u;
> >
> > I compared the execution time of the second query, which measures
> > uuid_in() alone, with/without SIMD optimization. Here are results (the
> > median of 5 runs):
> >
> > HEAD: 208.879 ms
> > Patched: 40.983 ms
>
> Nice!
>
> > The improvements look promising to me. But in a realistic pipeline the
> > parse is a small fraction of the work, so end-to-end gains could be
> > much smaller.
> >
> > Feedback is very welcome.
>
> I had a quick look at the patch. It mostly looks good to me. I like
> the idea of falling back to the scalar path when an error occurs - a
> neat user experience.
>
> I think it's not worth adding a test case for this because I believe
> this code gets covered anyway from existing tests.
>
> A few comments:
>
> 1/
> + * pass the local esctx instead of escontext to hex_decode_safe() to
>
> Instead of using variable names in the comments, let's say something
> like: we pass a separate error context to detect errors in the SIMD
> path and fall back to the normal path instead of raising ERRORs, for a
> better user experience.

Agreed.

>
> 2/
> + if (esctx.error_occurred)
> + string_to_uuid_scalar(source, uuid, escontext);
>
> An error on supported platforms seems rare, but when one occurs, I
> think it's worth emitting a WARNING or LOG message. This way the query
> succeeds, but later in the server logs, if noticed, it could provide
> useful reasoning or uncover issues in the SIMD code.

Hmm, I'm concerned that emitting a WARNING or LOG message whenever we
parse an uncommon UUID form might be annoying rather than useful.
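
A minimal standalone sketch of that concern, not code from the patch:
the names uuid_parse_fast(), uuid_parse_scalar(), uuid_parse() and
decode_run() are hypothetical, and decode_run() is a plain scalar
stand-in for hex_decode_safe(). The scalar parser assumes the
documented input rule that a hyphen may follow any group of four hex
digits. The point is only that the fallback also triggers for valid
but uncommon input, so a message tied to it would fire on input that
parses fine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode exactly 2 * nbytes hex digits (stand-in for hex_decode_safe()). */
static bool
decode_run(const char *src, size_t nbytes, uint8_t *dst)
{
    for (size_t i = 0; i < nbytes; i++)
    {
        int hi = hex_val(src[2 * i]);
        int lo = hex_val(src[2 * i + 1]);

        if (hi < 0 || lo < 0)
            return false;
        dst[i] = (uint8_t) ((hi << 4) | lo);
    }
    return true;
}

/* Fast path: 32 hex digits or 8-4-4-4-12, optionally wrapped in braces. */
bool
uuid_parse_fast(const char *src, uint8_t *out)
{
    size_t len = strlen(src);

    if (len >= 2 && src[0] == '{' && src[len - 1] == '}')
    {
        src++;
        len -= 2;
    }
    if (len == 32)
        return decode_run(src, 16, out);
    if (len == 36 && src[8] == '-' && src[13] == '-' &&
        src[18] == '-' && src[23] == '-')
        return decode_run(src, 4, out) &&
               decode_run(src + 9, 2, out + 4) &&
               decode_run(src + 14, 2, out + 6) &&
               decode_run(src + 19, 2, out + 8) &&
               decode_run(src + 24, 6, out + 10);
    return false;               /* not a common form; caller falls back */
}

/* Scalar path: optional braces, optional hyphen after every 4 hex digits. */
bool
uuid_parse_scalar(const char *src, uint8_t *out)
{
    bool braces = (*src == '{');

    if (braces)
        src++;
    for (int i = 0; i < UUID_LEN; i++)
    {
        int hi = hex_val(src[0]);
        int lo = (hi < 0) ? -1 : hex_val(src[1]);

        if (hi < 0 || lo < 0)
            return false;
        out[i] = (uint8_t) ((hi << 4) | lo);
        src += 2;
        if (*src == '-' && (i % 2) == 1 && i < UUID_LEN - 1)
            src++;
    }
    if (braces && *src++ != '}')
        return false;
    return *src == '\0';
}

/* Try the fast path first; report whether the fallback was needed. */
bool
uuid_parse(const char *src, uint8_t *out, bool *used_fallback)
{
    *used_fallback = !uuid_parse_fast(src, out);
    return *used_fallback ? uuid_parse_scalar(src, out) : true;
}
```

In this sketch, "a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11" is valid input
but still sets used_fallback, so a WARNING keyed on the fallback would
be emitted for it every time.
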
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com