Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Date: 2025-10-30 19:10:42
Message-ID: CAD21AoAd8smQkJWWoghBZUAQT5xk=C6mXoN99sOWo4wEdg+uLQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 29, 2025 at 5:19 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 28 Oct 2025, at 22:44, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Andrey has shared his patch for base32hex support before[1]. While it
> > needs to be updated, it seems to implement sufficient function.
>
> I'd propose something like attached patch. It's on top of Ilmari's v2 patch with small suggestions as a step 2.
>

I've reviewed the v3 patches, and here are some review comments.

v3-0001 and v3-0002:

- errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
- errmsg("invalid uuid length"));
+ (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+ errmsg("invalid length for UUID"),
+ errdetail("Expected %d bytes, got %d.", UUID_LEN, len)));

How about an error message like "invalid input length for type uuid"?
I think "uuid" should be lower case since it refers to the PostgreSQL
uuid data type, and it's better to pass the type name via a %s format
instead of writing "uuid" directly in the message (see
string_to_uuid() for an example).

As for the errdetail message, should we also add "bytea" after "got %d"?

---
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error

We already have tests for casting bytea to integer data types in
strings.sql. I suggest moving the tests for casting from bytea to uuid
there. For the uuid.sql file, we could add a test to verify that a
UUID value remains unchanged when it's cast to bytea and back to
uuid. For example,

SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;

---
I think we should update the documentation in the uuid section to
describe casting between bytea and uuid. For reference, we have a
similar description for bytea and integer[1].

v3-0003:

base32hex_encode() doesn't seem to add '=' padding; is that
intentional? I don't see anything in RFC 4648 saying that we can omit
the '=' padding.
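
To illustrate what padded output would look like, here is a minimal standalone sketch. It is not the patch's base32hex_encode(); the helper name b32hex_encode_padded() and its signature are hypothetical, and it only assumes the RFC 4648 "Extended Hex" alphabet with output padded to a multiple of 8 characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 4648 "Extended Hex" alphabet: values 0..31 map to 0-9 then A-V. */
static const char b32hex_alphabet[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";

/*
 * Hypothetical helper, for illustration only.
 *
 * Encodes len bytes of src as base32hex with '=' padding and
 * NUL-terminates the result. dst must have room for
 * ((len + 4) / 5) * 8 + 1 bytes. Returns the number of characters
 * written, not counting the terminating NUL.
 */
static size_t
b32hex_encode_padded(const uint8_t *src, size_t len, char *dst)
{
    size_t      out = 0;
    uint32_t    buf = 0;        /* bit accumulator */
    int         bits = 0;       /* number of not-yet-emitted bits in buf */

    assert(src != NULL || len == 0);
    assert(dst != NULL);

    for (size_t i = 0; i < len; i++)
    {
        buf = (buf << 8) | src[i];
        bits += 8;

        /* Emit as many complete 5-bit groups as are available. */
        while (bits >= 5)
        {
            dst[out++] = b32hex_alphabet[(buf >> (bits - 5)) & 0x1F];
            bits -= 5;
        }
    }

    /* Flush leftover bits, zero-filled on the right to a full group. */
    if (bits > 0)
        dst[out++] = b32hex_alphabet[(buf << (5 - bits)) & 0x1F];

    /* Pad with '=' up to the next multiple of 8 characters. */
    while (out % 8 != 0)
        dst[out++] = '=';

    dst[out] = '\0';
    return out;
}
```

With this sketch, the RFC 4648 test vector "foobar" encodes to "CPNMUOJ1E8======", and a 16-byte UUID produces 26 data characters followed by 6 '=' characters, 32 in total.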

---
I think the patch should add tests not only for the uuid data type but
also for general cases, as we have for the other encodings.

---
In uuid.sql tests, how about adding some tests to check if base32hex
maintains the sortability of UUIDv7 data?
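
The property such a test would check can be sketched outside the server as follows. This is only an illustration: the helper names are hypothetical, the encoder is a standalone unpadded one rather than the patch's code, and the comparison is byte-wise (strcmp), which corresponds to the bit-wise comparison RFC 4648 has in mind for this alphabet.

```c
#include <stdint.h>
#include <string.h>

#define UUID_LEN        16
#define UUID_B32HEX_LEN 26      /* ceil(128 / 5) characters, unpadded */

/* RFC 4648 "Extended Hex" alphabet, which is in ascending ASCII order. */
static const char b32hex_alphabet[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";

/* Hypothetical helper: unpadded base32hex text of a 16-byte UUID. */
static void
uuid_to_b32hex(const uint8_t uuid[UUID_LEN], char dst[UUID_B32HEX_LEN + 1])
{
    uint32_t    buf = 0;        /* bit accumulator */
    int         bits = 0;       /* number of not-yet-emitted bits in buf */
    int         out = 0;

    for (int i = 0; i < UUID_LEN; i++)
    {
        buf = (buf << 8) | uuid[i];
        bits += 8;
        while (bits >= 5)
        {
            dst[out++] = b32hex_alphabet[(buf >> (bits - 5)) & 0x1F];
            bits -= 5;
        }
    }

    /* 128 = 25 * 5 + 3, so three bits remain; zero-fill on the right. */
    dst[out++] = b32hex_alphabet[(buf << (5 - bits)) & 0x1F];
    dst[out] = '\0';
}

/* Reduce a comparison result to -1, 0 or 1. */
static int
sign(int x)
{
    return (x > 0) - (x < 0);
}

/*
 * Returns 1 if comparing the raw UUID bytes and comparing the encoded
 * strings byte-wise give the same ordering, 0 otherwise.
 */
static int
b32hex_order_preserved(const uint8_t a[UUID_LEN], const uint8_t b[UUID_LEN])
{
    char        ea[UUID_B32HEX_LEN + 1];
    char        eb[UUID_B32HEX_LEN + 1];

    uuid_to_b32hex(a, ea);
    uuid_to_b32hex(b, eb);
    return sign(memcmp(a, b, UUID_LEN)) == sign(strcmp(ea, eb));
}
```

Since UUIDv7 values start with a big-endian timestamp, two values generated at different times should compare the same way as raw bytes and as base32hex text. A regression test in uuid.sql could check the same thing at the SQL level, keeping in mind that a non-C collation might order the encoded text differently.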

---
I would suggest registering the patches for the next commitfest, if
you haven't already.

Regards,

[1] https://www.postgresql.org/docs/devel/functions-binarystring.html

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
