| From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Optimize UUID parse using SIMD |
| Date: | 2026-06-25 19:30:09 |
| Message-ID: | CALj2ACUj8mvpC22c6QtwMapYVX5QeXuQU_9T7C6iqQcUQshqag@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Thu, Jun 25, 2026 at 11:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Since commit ec8719ccbfcd made hex_decode_safe() SIMD-aware, decoding
> a run of hex digits is now fast. The attached patch reuses
> hex_decode_safe() in the UUID input function to speed up parsing.
>
> We accept several textual forms of a UUID[1]. The fast path handles
> the common ones: 32 hex digits, the canonical 8x-4x-4x-4x-12x form
> (where "nx" means n hex digits), and either of those wrapped in
> braces. Otherwise, it falls back to the ordinary scalar UUID parse.
>
> I've benchmarked the parse speed using the following query:
>
> CREATE TEMP TABLE u AS SELECT gen_random_uuid()::text AS t FROM
> generate_series(1, 1000000);
> EXPLAIN (ANALYZE, TIMING OFF) SELECT t::uuid FROM u;
>
> I compared the execution time of the second query, which measures
> uuid_in() alone, with/without SIMD optimization. Here are results (the
> median of 5 runs):
>
> HEAD: 208.879 ms
> Patched: 40.983 ms
Nice!
> The improvements look promising to me. But in a realistic pipeline the
> parse is a small fraction of the work, so end-to-end gains could be
> much smaller.
>
> Feedback is very welcome.
I had a quick look at the patch. It mostly looks good to me. I like
the idea of falling back to the scalar path when an error occurs - a
neat user experience.
I don't think it's worth adding a test case for this, since I believe
the existing tests already cover this code.
A few comments:
1/
+ * pass the local esctx instead of escontext to hex_decode_safe() to
Instead of using variable names in the comments, let's say something
like: we pass a separate error context to detect errors in the SIMD
path and fall back to the normal path instead of raising ERRORs, for a
better user experience.
2/
+ if (esctx.error_occurred)
+ string_to_uuid_scalar(source, uuid, escontext);
An error on supported platforms seems rare, but when one occurs, I
think it's worth emitting a WARNING or LOG message. This way the query
still succeeds, and the message in the server log can later help
explain what happened or uncover issues in the SIMD code.
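To make comment 2/ concrete, here is a minimal standalone sketch of the
"try the fast path, fall back to the scalar path, and log the fallback"
pattern. It is not the patch and not PostgreSQL code. All names in it are
hypothetical stand-ins: fast_parse_uuid() plays the role of the
hex_decode_safe()-based path (without any SIMD), scalar_parse_uuid() plays
the role of string_to_uuid_scalar(), the bool flag plays the role of the
local error context, and fprintf(stderr, ...) stands in for a LOG-level
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define UUID_LEN 16

/* Number of fallbacks taken so far, so callers can observe them. */
static int fallback_count = 0;

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical fast path: accepts only 32 hex digits or the 8-4-4-4-12
 * form, optionally wrapped in braces. It never reports an error itself;
 * it only sets *error_occurred so the caller can decide what to do.
 */
static void
fast_parse_uuid(const char *src, unsigned char *out, bool *error_occurred)
{
    size_t len = strlen(src);
    char   digits[32];
    int    n = 0;

    *error_occurred = true;     /* assume failure until fully parsed */
    if (len >= 2 && src[0] == '{' && src[len - 1] == '}')
    {
        src++;
        len -= 2;
    }
    if (len != 32 && len != 36)
        return;
    for (size_t i = 0; i < len; i++)
    {
        if (len == 36 && (i == 8 || i == 13 || i == 18 || i == 23))
        {
            if (src[i] != '-')
                return;
            continue;
        }
        if (hex_val(src[i]) < 0)
            return;
        digits[n++] = src[i];
    }
    for (int i = 0; i < UUID_LEN; i++)
        out[i] = (unsigned char) ((hex_val(digits[2 * i]) << 4) |
                                  hex_val(digits[2 * i + 1]));
    *error_occurred = false;
}

/*
 * Hypothetical scalar path: also allows an optional hyphen after every
 * group of four hex digits. Returns false on malformed input.
 */
static bool
scalar_parse_uuid(const char *src, unsigned char *out)
{
    bool braces = (*src == '{');

    if (braces)
        src++;
    for (int i = 0; i < UUID_LEN; i++)
    {
        int hi = hex_val(src[0]);
        int lo = (hi < 0) ? -1 : hex_val(src[1]);

        if (hi < 0 || lo < 0)
            return false;
        out[i] = (unsigned char) ((hi << 4) | lo);
        src += 2;
        if (*src == '-' && (i % 2) == 1 && i < UUID_LEN - 1)
            src++;
    }
    if (braces && *src++ != '}')
        return false;
    return *src == '\0';
}

/* Try the fast path; on failure, log the fallback and retry scalar. */
static bool
parse_uuid(const char *src, unsigned char *out)
{
    bool fast_failed;

    assert(src != NULL && out != NULL);
    fast_parse_uuid(src, out, &fast_failed);
    if (!fast_failed)
        return true;
    fallback_count++;
    fprintf(stderr, "LOG: fast UUID path rejected \"%s\", using scalar path\n", src);
    return scalar_parse_uuid(src, out);
}
```

One caveat with this sketch: the message fires on every fallback, including
inputs that are valid but simply not handled by the fast path (for example
a hyphen after every four digits), so a real implementation would probably
want to restrict it to failures that are genuinely unexpected.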
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com