Re: Sequence Access Methods, round two

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Sequence Access Methods, round two
Date: 2024-01-18 23:27:23
Message-ID: Zamz28IYDtidWsCG@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 18, 2024 at 04:54:06PM +0100, Matthias van de Meent wrote:
> On Thu, 18 Jan 2024, 16:06 Peter Eisentraut, <peter(at)eisentraut(dot)org> wrote:
>> On 01.12.23 06:00, Michael Paquier wrote:
>>> Please find attached a patch set that aims at implementing sequence
>>> access methods, with callbacks following a model close to table and
>>> index AMs, with a few cases in mind:
>>> - Global sequences (including range-allocation, local caching).
>>> - Local custom computations (a-la-snowflake).
>>
>> That's a lot of code, but the use cases are summarized in two lines?!?
>>
>> I would like to see a lot more elaboration what these uses cases are (I
>> recognize the words, but do we have the same interpretation of them?)
>> and how they would be addressed by what you are proposing, and better
>> yet an actual implementation of something useful, rather than just a
>> dummy test module.
>
> At $prevjob we had a use case for PRNG to generate small,
> non-sequential "random" numbers without the birthday problem occurring
> in sqrt(option space) because that'd increase the printed length of
> the numbers beyond a set limit. The sequence API proposed here
> would've been a great alternative to the solution we found, as it
> would allow a sequence to be backed by an Linear Congruential
> Generator directly, rather than the implementation of our own
> transactional random_sequence table.

Interesting.

Yes, one of the advantages of this API layer is that all the
computation is hidden behind a sequence object at the PostgreSQL
level, hence applications just need to set a GUC to select a given
computation method *while* still using the same DDLs from their
application, or just append USING to their CREATE SEQUENCE but I've
heard that applications would just do the former and forget about it.

The reason why this stuff has bumped into my desk is that we have no
good solution in-core for globally-distributed transactions for
active-active deployments. First, anything we have needs to be
plugged into default expressions of attributes like with [1] or [2],
or a tweak is to use sequence values that are computed with different
increments to avoid value overlaps across nodes. Both of these
require application changes, which is meh for a bunch of users. The
second approach with integer-based values can be become particularly a
pain if one has to fix value conflicts across nodes as they'd usually
require extra tweaks with the sequence definitions, especially if it
blocks applications in the middle of the night. Sequence AMs offer
more control on that. For example, snowflake IDs can rely on a GUC to
set a specific machine ID to force some of the bits of a 64-bit
integer to be the same for a single node in an active-active
deployment, ensuring that any value computed across *all* the nodes of
a cluster are always unique, while being maintained behind a sequence
object in-core. (I can post a module to demonstrate that based on the
sequence AM APIs, just wait a min.. Having more than a test module
and/or a contrib is a separate discussion.)

By the way, patches 0001 to 0004 are just refactoring pieces.
Particularly, 0001 redesigns pg_sequence_last_value() to work across
the board for upgrades and dumps, while avoiding a scan of the
sequence "heap" relation in pg_dump. These are improvements for the
core code in any case.

[1]: https://github.com/pgEdge/snowflake
[2]: https://www.postgresql.org/message-id/TY3PR01MB988983D23E4F1DA10567BC5BF5B9A%40TY3PR01MB9889.jpnprd01.prod.outlook.com
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirk Wolak 2024-01-19 00:50:27 Re: Oom on temp (un-analyzed table caused by JIT) V16.1 [Fixed Already]
Previous Message Anton Kirilov 2024-01-18 23:11:22 Re: Add PQsendSyncMessage() to libpq