Re: Could postgres12 support millions of sequences? (like 10 million)

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: pabloa98 <pabloa98(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Rob Sargent <robjsargent(at)gmail(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Could postgres12 support millions of sequences? (like 10 million)
Date: 2020-03-20 16:59:51
Message-ID: e9337b4c-cbd7-7ad9-ea01-5af415139ad1@aklaver.com
Lists: pgsql-general

On 3/19/20 10:31 PM, pabloa98 wrote:
> I see.
>
> Any suggestion? It should behave like a sequence in the sense that
> concurrent transactions will get different numbers from this
> alternative, sequence-like solution.
>
> In our case, we will need to call nextval('some_seq') (or similar)
> from different processes no more than twice every minute.
>
>
> It would be nice to have a sequence data type. Correct me if I am wrong,
> but it seems to me that a sequence data type would cost the same or less
> than the current sequence implementation.
>
> The schema would be clearer too. We could have a table like:
>
> CREATE TABLE pair(
>     group INT NOT NULL,
>     element INT NOT NULL,
>     seq SEQUENCE INCREMENT 1
>         START 1
>         CACHE 1
>         MINVALUE 1
>         MAXVALUE 99999999
>         NOT NULL,
>     PRIMARY KEY (group, element)
> );
>
> And then:
>
> INSERT INTO event(group, element, code)
> VALUES (
>     1,
>     1,
>     ( SELECT nextval(p.seq) FROM pair p
>       WHERE p.group = 1 AND p.element = 1 )
> );
>
> Or perhaps storing all the sequences as rows in a single table would
> give the same behavior.

If code just records the order of insertion for each (group, element)
combination, then maybe something like below:

CREATE TABLE event(
    group_id INT NOT NULL, -- changed: "group" is a reserved word
    element INT NOT NULL,
    insert_ts timestamptz NOT NULL DEFAULT clock_timestamp(),
    PRIMARY KEY(group_id, element, insert_ts)
);

insert into event(group_id, element) VALUES
(1, 1),
(1, 1),
(1, 1),
(2, 1),
(1, 1),
(1, 3),
(1, 1),
(1, 3),
(2, 1),
(2, 1);

select * from event;

 group_id | element |           insert_ts
----------+---------+--------------------------------
        1 |       1 | 03/20/2020 09:51:12.675926 PDT
        1 |       1 | 03/20/2020 09:51:12.675985 PDT
        1 |       1 | 03/20/2020 09:51:12.675991 PDT
        2 |       1 | 03/20/2020 09:51:12.675996 PDT
        1 |       1 | 03/20/2020 09:51:12.676 PDT
        1 |       3 | 03/20/2020 09:51:12.676004 PDT
        1 |       1 | 03/20/2020 09:51:12.676008 PDT
        1 |       3 | 03/20/2020 09:51:12.676012 PDT
        2 |       1 | 03/20/2020 09:51:12.676016 PDT
        2 |       1 | 03/20/2020 09:51:12.67602 PDT
(10 rows)

select group_id, element,
       row_number() OVER (PARTITION BY group_id, element
                          ORDER BY insert_ts) AS code
from event;

 group_id | element | code
----------+---------+------
        1 |       1 |    1
        1 |       1 |    2
        1 |       1 |    3
        1 |       1 |    4
        1 |       1 |    5
        1 |       3 |    1
        1 |       3 |    2
        2 |       1 |    1
        2 |       1 |    2
        2 |       1 |    3
(10 rows)
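
If the code must be assigned and stored at insert time, rather than
derived on read, the "sequences as rows in one table" idea above could
be sketched with a plain counter table. This is only an illustration,
not something tested against your schema; the names (pair_counter,
last_code) are made up:

CREATE TABLE pair_counter(
    group_id  INT NOT NULL,
    element   INT NOT NULL,
    last_code INT NOT NULL DEFAULT 0,
    PRIMARY KEY (group_id, element)
);

-- Claim the next code for (1, 1). The row lock taken by UPDATE
-- serializes concurrent transactions, so each one sees a distinct
-- value, and a rollback gives the number back (no gaps).
UPDATE pair_counter
   SET last_code = last_code + 1
 WHERE group_id = 1 AND element = 1
RETURNING last_code;

The trade-off is that writers on the same (group_id, element) pair
block each other until commit, which should not matter at two calls
per minute.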

>
> Pablo
>
>
> On Thu, Mar 19, 2020 at 7:56 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Michael Lewis <mlewis(at)entrata(dot)com> writes:
> > On Thu, Mar 19, 2020, 5:48 PM David G. Johnston
> > <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> >> However, one other consideration with sequences: do you care that
> >> PostgreSQL will cache/pin (i.e., no release) every single sequence you
> >> touch for the lifetime of the session? (I do not think DISCARD matters
> >> here but I'm just guessing)
>
> > Would you expand on this point or is there someplace specific in the
> > documentation on this?
>
> I think what David is worried about is that a sequence object is a
> one-row table in PG's implementation.  Thus
>
> (1) each sequence requires a dozen or two rows in assorted system
> catalogs (not sure exactly how many offhand).
>
> (2) each sequence uses up 8KB on disk for its table file.
>
> (3) each sequence you actually access within a particular session
> results in creation of relcache and catcache entries in that
> session's local memory.  I'm not sure offhand how big those are
> either, but a few KB per sequence would be a reasonable guess.
>
> (4) each sequence competes for space in the shared-buffer arena,
> since its 8K block has to be swapped into there whenever you try
> to access/increment the sequence's value.
>
> This is all mighty inefficient of course, and there's been talk
> of trying to reduce the per-sequence overhead; but I don't know
> of anyone actively working on that.  As things stand, I think
> having millions of sequences would be quite painful performance-
> wise, especially if your workload were actively hitting a lot
> of them concurrently.  It would work, for some value of "work",
> but it wouldn't perform very well.
>
> Also, as multiple people mentioned already, this still wouldn't
> guarantee gap-free sequences of ID values.
>
>                         regards, tom lane
>
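
For anyone who wants to gauge that overhead concretely, here is a
quick, illustrative probe of points (1), (2) and (4) on a scratch
database. The sequence name is arbitrary, and pg_buffercache is an
optional contrib extension:

CREATE SEQUENCE overhead_probe_seq;

-- Point (2): every sequence owns an 8KB table file on disk.
SELECT pg_relation_size('overhead_probe_seq');

-- Part of point (1): each sequence gets a pg_class row, a pg_sequence
-- row and several pg_attribute rows, among other catalog entries.
SELECT count(*) FROM pg_attribute
WHERE attrelid = 'overhead_probe_seq'::regclass;

-- Point (4): after touching the sequence, its block sits in shared
-- buffers (requires CREATE EXTENSION pg_buffercache).
SELECT nextval('overhead_probe_seq');
SELECT count(*) FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('overhead_probe_seq');

DROP SEQUENCE overhead_probe_seq;

At 10 million sequences, the 8KB files alone come to roughly 80GB
before any real data is stored.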

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
