Re: Could postgres12 support millions of sequences? (like 10 million)

From: pabloa98 <pabloa98(at)gmail(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Rob Sargent <robjsargent(at)gmail(dot)com>, "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Could postgres12 support millions of sequences? (like 10 million)
Date: 2020-03-23 03:55:43
Message-ID: CAEjudX4Y-ZvdGvpag9-fxaV3oQwAYxiwAxoibYVXLbPKt410pw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Mar 22, 2020 at 5:36 PM Christopher Browne <cbbrowne(at)gmail(dot)com>
wrote:

> On Sun, 22 Mar 2020 at 17:54, pabloa98 <pabloa98(at)gmail(dot)com> wrote:
>
>>
>> So the question may actually be:
>>>
>>> How do we improve our locking code, so we don't have to spawn millions
>>> of sequences?
>>>
>>> What is the locking method you are using?
>>>
>>
>> I am not using locking with the million sequence solution. I do not want
>> something that locks because the problems described below
>>
>> I prefer the solution generates a gap (skip a couple of numbers) and not
>> using locks.
>>
>
> If you can cope with gaps, that's a good thing. That means that some kind
> of usage of sequences would be compatible with your application.
>

Indeed

>
>
>>
>>> > The lock part is because we solved a similar problem with a counter by
>>> > row locking the counter and increasing it in another part of the
>>> > database. The result is that all the queries using that table are
>>> queued
>>> > by pair (group, element) that is not that bad because we are not
>>> > inserting thousands of rows by second. Still is killing cluster
>>> > performance (but performance is still OK from the business point of
>>> > view). The problem using locks is that they are too sensitive to
>>> > developer errors and bugs. Sometimes connected clients aborts and the
>>> > connection is returned to the pool with the lock active until the
>>> > connection is closed or someone unlocks the row. I would prefer to
>>> have
>>> > something more resilient to developers/programming errors, if possible.
>>> >
>>>
>>> Now I read this paragraph, I realize I was not clear enough.
>> I am saying we do not want to use locks because of all the problems
>> described.
>>
>
> Cool, that means you have been thinking through similar considerations to
> what others have in mind, and it doesn't sound like there are dramatically
> different understandings.
>
> Let's circle back to the schema that you provided...
>
> CREATE TABLE counter(
> group INT NOT NULL,
> element INT NOT NULL,
> seq_number INT NOT NULL default 0,
> CONSTRAINT PRIMARY KEY (group, element)
> );
>
> That's not a table using the sequential values; that's what you imagining
> you could have as a way of referencing the sequences, right?
>
> I would represent it slightly differently...
>
> create table counter (
> group integer not null,
> element integer not null,
> sequence_name name,
> primary key (group, element)
> );
> Arguably, there's no need for sequence_name altogether, as it's never
> directly referenced by anything.
>
> And then have a function that might fire upon creation of new entries in
> this table.
>
> create or replace function generate_sequence (i_group integer, i_element
> integer) returns name
> as $$
> declare
> c_seqname name;
> c_query text;
> begin
> c_seqname := 'obj_counter_' || i_group || '_' || i_element;
> c_query := 'create sequence if not exists ' || c_seqname || ';';
> execute c_query;
> update counter set sequence_name = c_seqname where group = i_group and
> element = i_element;
> return c_seqname;
> end
> $$ language plpgsql;
>
> You'd need a trigger function to put onto the table that runs this
> function; that is left as an exercise for the reader.
>
> Then, on any of the tables where you need to assign sequence values, you'd
> need to run an "after" trigger to do the assignment. The function that
> finds the sequence value is kind of analagous:
> create or replace function get_next_counter (i_group integer, i_element
> integer) returns integer -- or bigint?
> as $$
> declare
> c_seqname name;
> c_query text;
> c_seqval integer;
> begin
> c_seqname := 'obj_counter_' || i_group || '_' || i_element;
> c_query := 'select nextval(' || quote_ident( c_seqname_ || ');';
> execute c_query into c_seqval;
> return c_seqval;
> end
> $$ language plpgsql;
>
> Again, that just grabs a nextval(); you'd need to execute this inside a
> trigger function called ON INSERT on any of the tables that need sequence
> values assigned.
> That encapsulates the usage of this horde of sequences. You're probably
> calling get_next_counter() millions of times, so perhaps that code gets
> expanded directly into place in the trigger function.
>
>
I like this approach.
When I have a medium number of sequence I will report how it behaves. It
will take some time though.

Pablo

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Paquier 2020-03-23 04:57:23 Re: Passwordcheck configuration
Previous Message David G. Johnston 2020-03-23 01:58:39 Re: Could postgres12 support millions of sequences? (like 10 million)