Re: UUID v7

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>
Cc: pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Nick Babadzhanian <pgnickb(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, "Kyzer Davis (kydavis)" <kydavis(at)cisco(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "brad(at)peabody(dot)io" <brad(at)peabody(dot)io>, Kirk Wolak <wolakk(at)gmail(dot)com>
Subject: Re: UUID v7
Date: 2024-01-29 18:32:38
Message-ID: 5932FFF8-4E5F-44DC-8FEA-22E8302759C7@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 25 Jan 2024, at 22:04, Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au> wrote:
>
> Aleksander,
>
> In this case the documentation must state that the functions uuid_extract_time() and uuidv7(T) are against the RFC requirements, and that developers may use these functions with caution at their own risk, and these functions are not recommended for production environment.

Refining documentation is good. However, saying that these functions are not recommended for production must be based on some real threats.

>
> The function uuidv7(T) is not better than uuid_extract_time(). Careless developers may well pass any business date into this function: document date, registration date, payment date, reporting date, start date of the current month, data download date, and even a constant. This would be a profanation of UUIDv7 with very negative consequences.

Even if the developer pass constant time to uuidv7(T) they will get what they asked for - unique identifier. Moreover - it still will be keeping locality. There will be no negative consequences at all.
On the contrary, experienced developer can leverage parameter when data locality should be reduced. If you have serveral streams of data, you might want to introduce some shift in reduce contention.
For example, you can generate uuidv7(now() + '1 day' * random(0,10)). This will split 1 contention point to 10 and increase ingestion performance 10x-fold.

> On 29 Jan 2024, at 18:58, Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
>
> If other timestamp sources or
> a custom timestamp epoch are required, UUIDv8 MUST be used.

Well, yeah. RFC says this... in 4 capital letters :) I believe it's kind of a big deficiency that k-way sortable identifiers are not implementable on top of UUIDv7. Well, let's go without this function. UUIDv7 is still an improvement over previous versions.

Jelte, your documentation corrections looks good to me, I'll include them in next version.

Thanks!

Best regards, Andrey Borodin.

In response to

  • Re: UUID v7 at 2024-01-25 17:04:05 from Sergey Prokhorenko

Responses

  • Re: UUID v7 at 2024-01-29 20:38:27 from Jelte Fennema-Nio
  • Re: UUID v7 at 2024-01-30 00:27:21 from Sergey Prokhorenko

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2024-01-29 18:35:52 Re: Schema variables - new implementation for Postgres 15
Previous Message Nathan Bossart 2024-01-29 18:21:30 Re: cleanup patches for incremental backup