Sequences, txids, and serial order of transactions

From: Christian Ohler <ohler(at)shift(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Sequences, txids, and serial order of transactions
Date: 2016-06-12 02:03:55
Message-ID: CAOsiKE+1meYaQkowU0Czo4qxruLJrC0yE8aXN7aQ+vnxneV-AA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

we have a use case similar to auditing packages like pgMemento or Audit
Trigger 91plus – we are looking to keep an ordered history of certain write
transactions. I'm trying to understand the trade-offs between different
ways of getting that order, i.e., assigning numbers to transactions
(ideally strictly monotonic, modulo concurrency). All of our transactions
are serializable (for now).

The two main candidates I'm aware of are txid_current() or nextval() of a
sequence; but perhaps there are other mechanisms we should be considering.

Some observations and questions from my investigations so far (and please
correct me if any of this is wrong):

(1) managing a counter in a table would essentially eliminate concurrency,
so we're not too interested in that

(2) the orders produced by txid_current and a sequence can be different
(unsurprisingly). (If it was desirable to make them match, we could
probably do so by briefly holding a lock while we call both txid_current
and nextval – seems like this shouldn't limit concurrency too much. Or
would it? Is one of them potentially slow?)

(3) logical replication has mechanisms to keeps sequences in sync, but not
txid_current (unsurprisingly)

(4) behaviors like
http://permalink.gmane.org/gmane.comp.db.postgresql.bugs/35636 make me
think that monotonicity of txid_current is not something we should bet on

(5) Postgres can give us a "high watermark" ("no transactions with IDs
below this number are still in-flight") for txid_current (using
txid_snapshot_xmin(txid_current_snapshot())), but has no equivalent feature
for sequences

(6) neither txid_current nor a sequence give us a valid serial order of the
transactions

(7) given that we can't get a valid serial order, what guarantees can we
get from the ordering? I'm not entirely sure what to look for, but at a
minimum, it seems like we want writes that clobber each other to be
correctly ordered. Are they, for both txid_current and for sequences? My
guess was "yes" for txids (seems intuitive but just a guess) and "no" for
sequences (because
https://www.postgresql.org/docs/current/static/functions-sequence.html
mentions that sequences are non-transactional); but for sequences, I
couldn't immediately construct a counterexample and am wondering whether
that's by design. Specifically, it seems that Postgres acquires the
snapshot for the transaction (if it hasn't already) when I call nextval(),
and as long as the snapshot is acquired before the sequence is incremented,
I suspect that this guarantees ordering writes. Does it?

(8) ...and is the snapshot acquired before or after the increment? (Is it
acquired as soon as Postgres sees SELECT, before even evaluating
nextval()? I think that's what I'm seeing. Is that something we can rely
on, or should we SELECT txid_current_snapshot() before SELECT nextval() to
be on the safe side?)

(9) are there other important properties that one order satisfies but the
other doesn't, or that neither satisfies but that we should be aware of?

(3) and (4) seem like strong reasons to go with a sequence, as long as we
can live without (5) and figure out (7) and (8).

Any help appreciated,
Christian.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Quan Zongliang 2016-06-12 04:47:16 Fwd: [ANNOUNCE] pgAdmin 4 v1.0 Beta 1 Released
Previous Message Artur Zakirov 2016-06-11 20:40:00 Re: [pg_trgm] Making similarity(?, ?) < ? use an index