Re: UUID v7

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, "Kyzer Davis (kydavis)" <kydavis(at)cisco(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrey Borodin <amborodin86(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "brad(at)peabody(dot)io" <brad(at)peabody(dot)io>, "wolakk(at)gmail(dot)com" <wolakk(at)gmail(dot)com>
Subject: Re: UUID v7
Date: 2023-07-07 12:06:19
Message-ID: D8F6A3ED-9F70-4FBE-A259-AABFD89083C4@yandex-team.ru
Lists: pgsql-hackers

> On 6 Jul 2023, at 21:38, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
>
> I think it would be reasonable to review this patch now.
+1.

Also, I think we should discuss UUID v8. UUID version 8 provides an RFC-compatible format for experimental or vendor-specific use cases. Revision 1 of the IETF draft contained interesting code for v8: very similar to v7, but with fields for "node ID" and "rolling sequence number".
I think this is a reasonable approach, so I am attaching an implementation of UUID v8 per [0]. But from my point of view this implementation has some flaws.
These two new fields, "node ID" and "sequence", are there not for uniqueness but rather for data locality.
But they are placed at the end, in bytes 14 and 15, after the randomly generated bits.
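
To make the locality concern concrete, here is a minimal Python sketch of such a layout. It is not the attached patch and not the draft's code: the function name, the one-byte field widths, and the choice of byte 14 for the sequence and byte 15 for the node ID are assumptions for illustration only.

```python
import uuid

def uuid8_trailing_fields(unix_ms: int, seq: int, node: int, rand: bytes) -> uuid.UUID:
    """Hypothetical v8 layout: timestamp, random bits, then sequence and node ID last.

    `rand` must supply at least 8 bytes of random filler.
    """
    b = bytearray(16)
    b[0:6] = unix_ms.to_bytes(6, "big")   # bytes 0-5: millisecond timestamp
    b[6:14] = rand[:8]                    # bytes 6-13: random filler
    b[6] = 0x80 | (b[6] & 0x0F)           # version nibble = 8
    b[8] = 0x80 | (b[8] & 0x3F)           # variant bits = 10
    b[14] = seq & 0xFF                    # byte 14: rolling sequence (assumed position)
    b[15] = node & 0xFF                   # byte 15: node ID (assumed position)
    return uuid.UUID(bytes=bytes(b))

# Same millisecond, consecutive sequence numbers, different random bits.
ms = 1_688_731_579_000
first = uuid8_trailing_fields(ms, seq=1, node=7, rand=b"\xff" * 8)
second = uuid8_trailing_fields(ms, seq=2, node=7, rand=b"\x00" * 8)
print(first.version, second < first)  # prints: 8 True
```

The second identifier sorts before the first even though its sequence number is higher, because the random bytes are compared before the trailing fields.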

I think that "sequence" is there to help generate locally ascending identifiers when the real-time clock does not provide enough resolution. So the "sequence" field must be placed right after the 6 bytes of the time-generated identifier.
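
A minimal Python sketch of this placement follows. The function name and the 12-bit counter width (the bits between the version nibble and the variant bits) are assumptions for illustration, not something taken from the draft or the patch.

```python
import uuid

def uuid_seq_after_time(unix_ms: int, seq: int, rand: bytes) -> uuid.UUID:
    """Hypothetical layout: 48-bit timestamp, version, 12-bit sequence, then random bits.

    `rand` must supply at least 8 bytes of random filler.
    """
    b = bytearray(16)
    b[0:6] = unix_ms.to_bytes(6, "big")   # bytes 0-5: millisecond timestamp
    b[6] = 0x80 | ((seq >> 8) & 0x0F)     # version nibble = 8, then top 4 bits of sequence
    b[7] = seq & 0xFF                     # low 8 bits of sequence
    b[8:16] = rand[:8]                    # bytes 8-15: random filler
    b[8] = 0x80 | (b[8] & 0x3F)           # variant bits = 10
    return uuid.UUID(bytes=bytes(b))

# Same millisecond; the random tails are deliberately out of order.
ms = 1_688_731_579_000
ids = [
    uuid_seq_after_time(ms, 1, b"\xff" * 8),
    uuid_seq_after_time(ms, 2, b"\x00" * 8),
    uuid_seq_after_time(ms, 3, b"\x7f" * 8),
]
print(ids == sorted(ids))  # prints: True
```

With the counter directly after the timestamp, identifiers generated within one clock tick keep their generation order regardless of the random bits that follow.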

On the contrary, "node ID" must differentiate identifiers generated on different nodes. So it makes sense to place "node ID" before the timestamp, so that identifiers generated on different nodes tend to fall into different ranges.
However, section "6.4. Distributed UUID Generation" states that "node ID" is there to decrease the likelihood of a collision, so my intuition might be wrong here.
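
For comparison, a minimal Python sketch of the "node ID first" idea. The function name, the 8-bit node width, and the packing of the fields around the fixed version and variant bits are assumptions for illustration only.

```python
import uuid

def uuid_node_first(node: int, unix_ms: int, rand66: int) -> uuid.UUID:
    """Hypothetical layout: 8-bit node ID, 48-bit timestamp, 66 random bits.

    The 122 payload bits are spread around the version and variant fields.
    """
    payload = (
        ((node & 0xFF) << 114)
        | ((unix_ms & (2**48 - 1)) << 66)
        | (rand66 & (2**66 - 1))
    )
    high48 = payload >> 74               # goes before the version nibble
    mid12 = (payload >> 62) & 0xFFF      # goes between version and variant
    low62 = payload & (2**62 - 1)        # goes after the variant bits
    value = (high48 << 80) | (0x8 << 76) | (mid12 << 64) | (0b10 << 62) | low62
    return uuid.UUID(int=value)

# Node 1 generates later than node 2, yet still sorts into its own lower range.
a = uuid_node_first(node=1, unix_ms=1_688_731_579_999, rand66=2**66 - 1)
b = uuid_node_first(node=2, unix_ms=1_688_731_579_000, rand66=0)
print(a < b, a.bytes[0], b.bytes[0])  # prints: True 1 2
```

The trade-off is visible here: each node writes into its own range, but identifiers no longer sort by generation time across nodes.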

Do we want to provide this "vendor-specific" UUID with tweaks for databases? Or should we limit the scope to the well-defined UUID v7?

Best regards, Andrey Borodin.

[0] https://datatracker.ietf.org/doc/html/draft-ietf-uuidrev-rfc4122bis-01

Attachment Content-Type Size
v2-0001-Implement-UUID-v7-and-v8-as-per-IETF-draft.patch application/octet-stream 7.4 KB

In response to

  • Re: UUID v7 at 2023-07-06 16:38:27 from Peter Eisentraut

Responses

  • RE: UUID v7 at 2023-07-07 13:31:07 from Kyzer Davis (kydavis)
  • Re: UUID v7 at 2023-07-10 16:50:38 from Peter Eisentraut
