| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> | 
|---|---|
| To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> | 
| Cc: | Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions | 
| Date: | 2025-10-30 19:10:42 | 
| Message-ID: | CAD21AoAd8smQkJWWoghBZUAQT5xk=C6mXoN99sOWo4wEdg+uLQ@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On Wed, Oct 29, 2025 at 5:19 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 28 Oct 2025, at 22:44, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Andrey has shared his patch for base32hex support before[1]. While it
> > needs to be updated, it seems to implement sufficient function.
>
> I'd propose something like attached patch. It's on top of Ilmari's v2 patch with small suggestions as a step 2.
>
I've reviewed the v3 patches, and here are some review comments.
v3-0001 and v3-0002:
-                               errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
-                               errmsg("invalid uuid length"));
+                               (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+                                errmsg("invalid length for UUID"),
+                                errdetail("Expected %d bytes, got %d.", UUID_LEN, len)));
How about an error message like "invalid input length for type uuid"?
I think "uuid" should be lower case since it refers to the PostgreSQL uuid
data type, and it would be better to use the %s format rather than writing
"uuid" directly (see string_to_uuid() for an example).
As for the errdetail message, should we also add "bytea" after "got %d"?
---
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error
We already have tests for casting bytea to integer data types in
strings.sql. I suggest moving the bytea-to-uuid casting tests
there. For the uuid.sql file, we could add a test to verify that
a UUID value remains unchanged when it's cast to bytea and back to
uuid. For example,
SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;
---
I think we should update the documentation in the uuid section about
casting data between bytea and uuid. For reference, we have a similar
description for bytea and integer[1].
v3-0003:
base32hex_encode() doesn't seem to add '=' padding. Is that
intentional? I don't see anything in RFC 4648 saying that we can omit
the '=' padding.
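
To make the padding question concrete, here is a rough standalone sketch (not the patch's code; b32hex_encode_sketch() and its pad flag are names made up for illustration) of base32hex encoding with and without '=' padding. A 16-byte uuid is 128 bits, so it encodes to 26 characters unpadded, or 32 characters once padded to a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 4648 "Extended Hex" alphabet: digits first, then A-V. */
static const char b32hex_alphabet[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";

/*
 * Hypothetical helper, for illustration only.
 *
 * Encode srclen bytes from src into dst as base32hex and NUL-terminate.
 * When pad is nonzero, append '=' until the length is a multiple of 8.
 * dst must have room for 8 * ((srclen + 4) / 5) + 1 bytes.
 * Returns the number of characters written, excluding the NUL.
 */
static size_t
b32hex_encode_sketch(const uint8_t *src, size_t srclen, char *dst, int pad)
{
    uint32_t    buf = 0;        /* bit accumulator; only low nbits matter */
    int         nbits = 0;      /* number of not-yet-emitted bits in buf */
    size_t      out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < srclen; i++)
    {
        buf = (buf << 8) | src[i];
        nbits += 8;
        /* Emit one character per complete 5-bit group. */
        while (nbits >= 5)
        {
            dst[out++] = b32hex_alphabet[(buf >> (nbits - 5)) & 0x1F];
            nbits -= 5;
        }
    }

    /* Left-align any leftover bits in a final character. */
    if (nbits > 0)
        dst[out++] = b32hex_alphabet[(buf << (5 - nbits)) & 0x1F];

    /* Optional '=' padding up to a multiple of 8 characters. */
    if (pad)
    {
        while (out % 8 != 0)
            dst[out++] = '=';
    }

    dst[out] = '\0';
    return out;
}
```

With pad set, the input "foobar" comes out as "CPNMUOJ1E8======", which matches the base32hex test vector in RFC 4648; with pad cleared the same input gives "CPNMUOJ1E8".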
---
I think the patch should add tests not only for the uuid data type but
also for general cases, as we have for the other encodings.
---
In uuid.sql tests, how about adding some tests to check if base32hex
maintains the sortability of UUIDv7 data?
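
As a rough illustration of the property such a test would check, here is a standalone sketch (again not the patch's code; uuid_bytes_to_b32hex() and order_preserved() are made-up names). It assumes unpadded upper-case output and byte-wise string comparison, as with the C collation. Since the base32hex alphabet is in ascending ASCII order, comparing two encoded 16-byte values should give the same result as comparing the raw bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN            16
#define B32HEX_UUID_CHARS   26  /* ceil(128 / 5) characters, no padding */

/* RFC 4648 "Extended Hex" alphabet, which is in ascending ASCII order. */
static const char b32hex_alphabet[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";

/* Hypothetical helper: encode 16 raw uuid bytes as unpadded base32hex. */
static void
uuid_bytes_to_b32hex(const uint8_t src[UUID_LEN],
                     char dst[B32HEX_UUID_CHARS + 1])
{
    uint32_t    buf = 0;        /* bit accumulator; only low nbits matter */
    int         nbits = 0;
    int         out = 0;

    for (int i = 0; i < UUID_LEN; i++)
    {
        buf = (buf << 8) | src[i];
        nbits += 8;
        while (nbits >= 5)
        {
            dst[out++] = b32hex_alphabet[(buf >> (nbits - 5)) & 0x1F];
            nbits -= 5;
        }
    }

    /* 128 bits leave 3 trailing bits; left-align them in the last char. */
    dst[out++] = b32hex_alphabet[(buf << (5 - nbits)) & 0x1F];
    assert(out == B32HEX_UUID_CHARS);
    dst[out] = '\0';
}

/* Reduce a comparison result to -1, 0 or 1. */
static int
sign(int x)
{
    return (x > 0) - (x < 0);
}

/*
 * Return 1 if comparing the encoded strings byte-wise gives the same
 * ordering as comparing the raw uuid bytes, 0 otherwise.
 */
static int
order_preserved(const uint8_t a[UUID_LEN], const uint8_t b[UUID_LEN])
{
    char        ea[B32HEX_UUID_CHARS + 1];
    char        eb[B32HEX_UUID_CHARS + 1];

    uuid_bytes_to_b32hex(a, ea);
    uuid_bytes_to_b32hex(b, eb);

    return sign(memcmp(a, b, UUID_LEN)) == sign(strcmp(ea, eb));
}
```

For UUIDv7 the timestamp sits in the leading bytes, so raw byte order follows creation time, and order_preserved() returning 1 for a pair of values means their text form sorts the same way. A regression test in uuid.sql would need to pin the collation for this to hold.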
---
I would suggest registering the patches for the next commitfest, if you haven't already.
Regards,
[1] https://www.postgresql.org/docs/devel/functions-binarystring.html
-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com