Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions

From: Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Date: 2025-10-24 22:39:45
Message-ID: 953203149.383019.1761345585325@mail.yahoo.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday 24 October 2025 at 09:24:15 pm GMT+3, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

On Fri, Oct 24, 2025 at 12:17 AM Sergey Prokhorenko
<sergeyprokhorenko(at)yahoo(dot)com(dot)au> wrote:
>
>
> Masahiko,
>
> Developers will still be able to use the long canonical 'hex' UUID format for compatibility. But the short format is not a developer choice, but a convention. We mustn't allow a situation where 25% of systems use base32hex, 25% use Crocksford's Base32, 25% use base36, and 25% even use erroneously sorted base64. That's a very real nightmare. You, too, have every reason not to want to increase the number of built-in functions in PostgreSQL.
>
> But here is a solution that I hope will satisfy everyone:
>
> encode('019535d9-3df7-79fb-b466-fa907fa17f9e', 'uuid_to_base32hex') -> 06AJBM9TUTSVND36VA87V8BVJO

> Does it mean the first argument is uuid type data and when
> 'uuid_to_base32hex' is specified as the format the function requires a
> uuid data at the first argument?

Yes, that's right.PostgreSQL will automatically cast the string '019535d9-3df7-79fb-b466-fa907fa17f9e' to the uuid type, since the format is correct.

> I could not understand the difference
> between specifying 'based32hex' and 'uuid_to_base32hex' when encoding
> UUID data with base32hex encoding.

1. Specifying 'based32hex' in encode function means the first parameter is of bytea type as usual.

But specifying 'uuid_to_base32hex' means the first parameter is of uuid type.

2. The encode function does not yet support format based32hex. Therefore, it is not known whether padding ===== should be used.
But padding ===== is not used when specifying 'uuid_to_base32hex'.

> decode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 019535d9-3df7-79fb-b466-fa907fa17f9e

> Suppose that the decode() takes text data at the first argument and
> returns UUID data, the function signature would be decode(text, text)
> -> uuid. But we cannot create two functions that have the same name
> and the same argument types.
Yes, you're right. This is a problem that can't be solved without composite return values. We clearly took the wrong approach by coupling UUID conversion with encode/decode functions, which only apply to bytea. UUID and bytea are fundamentally different data types. Meanwhile, PostgreSQL has over 30 other type conversion functions that deal with other data types. For example, array_to_string, string_to_array, jsonb_to_record, to_char, to_timestamp, and to_hex. In this situation, the best solution would be to revert to the original uuid_to_base32hex() and base32hex_to_uuid() functions rather than deal with type incompatibility issues.

> Regards,

> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-10-24 23:31:25 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Previous Message Masahiko Sawada 2025-10-24 22:35:58 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart