Re: Efficiently advancing a sequence without risking it going backwards.

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Jeremy Schneider <schneider(at)ardentperf(dot)com>
Cc: Paul McGarry <paul(at)paulmcgarry(dot)com>, PostgreSQL Mailing Lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: Efficiently advancing a sequence without risking it going backwards.
Date: 2020-07-09 21:07:59
Message-ID: CAFNqd5XT=961hETVmzyzHHJeW=wv+KsWXk54TwUCsgX0PP-VRg@mail.gmail.com
Lists: pgsql-general

On Thu, 9 Jul 2020 at 12:59, Jeremy Schneider <schneider(at)ardentperf(dot)com>
wrote:

>
> > On Jul 6, 2020, at 19:06, Paul McGarry <paul(at)paulmcgarry(dot)com> wrote:
> >
> > I don't think I can use setval(), because it risks making sequences go
> backwards, eg:
> >
> > 1) Check values
> > DB1sequence: 1234
> > DB2sequence: 1233 (1 behind)
> > 2) setval('DB2sequence',1234);
> >
> > but if between (1) and (2) there are 2 nextval(DB2sequence) calls on
> another process, (2) would take the sequence back from 1235 to 1234 and I
> would end up trying to create a duplicate key ID from the sequence.
>
> An ability to “lock” the sequence momentarily would give you the tool you
> need, but I don’t think it’s there.
>
> Total hack, but if your application or users can retry when the rare error
> is encountered then one idea is to rename the sequence momentarily while
> you do the setval() then rename it back. Do an initial check without
> renaming, then re-check after renaming and before the setval() call.
>
> If you put retry logic into your application then make sure to include
> back-off logic so you don’t get an outage induced by thundering herd.
>
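
For concreteness, a rough sketch of that rename hack might look like the
following. The sequence name ("id_seq") and target value (1234) are made up,
each statement is committed separately, and this is only meant to show the
shape of the idea, not to recommend it:

-- Hypothetical sketch of the rename hack; names and values are invented.
-- While the sequence is renamed, concurrent nextval('id_seq') calls fail,
-- so the application needs retry logic with back-off.

-- 1) Initial check, without renaming.
SELECT last_value FROM id_seq;

-- 2) Rename, so that new nextval('id_seq') calls error out on name lookup.
ALTER SEQUENCE id_seq RENAME TO id_seq_maint;

-- 3) Re-check and advance the sequence, never moving it backwards.
SELECT setval('id_seq_maint', GREATEST(1234, last_value)) FROM id_seq_maint;

-- 4) Restore the original name.
ALTER SEQUENCE id_seq_maint RENAME TO id_seq;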

This is increasingly looking like a set of attempts to abuse sequences for
purposes they were never designed for.

The use case where you need a lock on the value, so that there can't
possibly be a hole in the sequence, points at the notion of a different kind
of function: one that takes out a lock on a table and serially gives out
"MAX+1" as the next value.

That isn't a very difficult function to write; the problem is that such a
function forcibly serializes all inserts through the function+table lock
that is giving out "MAX+1" values. That's going to be WAY slower than using
a sequence object, and about 98% of the time, people will prefer the
sequence object, particularly because it's about 98% faster.

I'm not quite sure if anyone has put out there a standard-ish idiom for
this; that seems like a not TOO difficult "exercise for the user."
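
For illustration, a minimal version of that sort of thing might look like
the sketch below. The table and column names are hypothetical, and it is a
sketch of the idea rather than something I'd recommend:

-- Hypothetical table, purely for illustration.
CREATE TABLE invoices (id bigint PRIMARY KEY, amount numeric);

CREATE OR REPLACE FUNCTION next_invoice_id() RETURNS bigint AS $$
DECLARE
    new_id bigint;
BEGIN
    -- Serialize all callers: this lock is held until the calling
    -- transaction commits or rolls back.
    LOCK TABLE invoices IN EXCLUSIVE MODE;
    SELECT COALESCE(MAX(id), 0) + 1 INTO new_id FROM invoices;
    RETURN new_id;
END;
$$ LANGUAGE plpgsql;

-- Usage:
INSERT INTO invoices (id, amount) VALUES (next_invoice_id(), 42);

Note that the lock lasts to the end of the caller's transaction, which is
exactly where the problems below come from.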

There will definitely be more failure cases, and *wildly* more fighting over
tuple locks, in a concurrent environment.

- An obvious failure is that if one connection asks for the new MAX+1, gets
it, and its transaction later fails for some unrelated reason, you'll still
potentially get "holes" in the series of values.

- If there are 10 connections trying to get MAX+1 concurrently, only one can
get it at a time, and that connection can't relinquish the lock until its
transaction has completed; the other 9 must wait, regardless of how much
work the "winner" still has to do.

These are amongst the reasons why people conclude they *don't* want that
kind of functionality.

It makes me think that the problem needs to be taken back to the initial
point of "I think I need some somewhat coordinated sequences", to poke at
what the *real* requirement is there, and why someone thinks the values
should be "somewhat coordinated." Something seems off there.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
