Re: Usage of epoch in txid_current

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Usage of epoch in txid_current
Date: 2019-03-25 04:01:07
Message-ID: CA+hUKGJdoJCT2n83+cGCeez0TiRV8gEoJYVQbeF_uzKE1Xkq8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 4, 2019 at 8:41 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2018-09-19 13:58:36 +1200, Thomas Munro wrote:
>
> > +/*
> > + * Advance nextFullXid to the value after a given xid. The epoch is inferred.
> > + * If lock_free_check is true, then the caller must be sure that it's safe to
> > + * read nextFullXid without holding XidGenLock (ie during recovery).
> > + */
> > +void
> > +AdvanceNextFullTransactionIdPastXid(TransactionId xid, bool lock_free_check)
> > +{
> > + TransactionId current_xid;
> > + uint32 epoch;
> > +
> > + if (lock_free_check &&
> > + !TransactionIdFollowsOrEquals(xid,
> > + XidFromFullTransactionId(ShmemVariableCache->nextFullXid)))
> > + return;
> > +
> > + LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
> > + current_xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
> > + if (TransactionIdFollowsOrEquals(xid, current_xid))
> > + {
> > + epoch = EpochFromFullTransactionId(ShmemVariableCache->nextFullXid);
> > + if (xid < current_xid)
> > + ++epoch; /* epoch wrapped */
> > + ShmemVariableCache->nextFullXid = MakeFullTransactionId(epoch, xid);
> > + FullTransactionIdAdvance(&ShmemVariableCache->nextFullXid);
> > + }
> > + LWLockRelease(XidGenLock);
> > }
>
> Is this really a good idea? Seems like it's going to perpetuate the
> problematic epoch logic we had before?

We could get rid of this in future, if certain WAL records and 2PC
state start carrying full xids. That would be much bigger surgery
than I wanted to do in this patch. This is the only place that
converts 32 bit -> 64 bit with an inferred epoch component. I have
explained why I think that's correct in new comments along with a new
assertion.

The theory is that the xids encountered in recovery and 2PC startup
can never be too far out of phase with the current nextFullXid. In
contrast, the original epoch tracking from commit 35af5422 wasn't
bounded in the same sort of way. Certainly no other code should be
allowed to do this sort of thing without very careful consideration of
how the epoch is bounded. The patch deliberately provides no general
purpose make-me-a-FullTransactionId-by-guessing-the-epoch facility.

While reviewing this, I also realised that the "lock_free_check"
function argument was unnecessary. The only place that was setting
that to false, namely RecordKnownAssignedTransactionIds(), might as
well just use the same behaviour. I've now rewritten
AdvanceNextFullTransactionIdPastXid() completely in light of all that;
please review.

> > printf(_("Latest checkpoint's full_page_writes: %s\n"),
> > ControlFile.checkPointCopy.fullPageWrites ? _("on") : _("off"));
> > printf(_("Latest checkpoint's NextXID: %u:%u\n"),
> > - ControlFile.checkPointCopy.nextXidEpoch,
> > - ControlFile.checkPointCopy.nextXid);
> > + EpochFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid),
> > + XidFromFullTransactionId(ControlFile.checkPointCopy.nextFullXid));

> I still think it's a mistake to not display the full xid here, and
> rather perpetuate the epoch stuff. I'm ok with splitting that into a
> separate commit, but this ought to be fixed in the same release imo.

Ok.

> > +/*
> > + * A 64 bit value that contains an epoch and a TransactionId. This is
> > + * wrapped in a struct to prevent implicit conversion to/from TransactionId.
> > + * Allowing such conversions seems likely to be error-prone.
> > + */
> > +typedef struct FullTransactionId
> > +{
> > + uint64 value;
> > +} FullTransactionId;
>
> Probably good to note here that not all values are valid normal xids.

Done.

> > +static inline FullTransactionId
> > +MakeFullTransactionId(uint32 epoch, TransactionId xid)
> > +{
> > + FullTransactionId result;
> > +
> > + result.value = ((uint64) epoch) << 32 | xid;
> > +
> > + return result;
> > +}
>
> Make sounds a bit like it's allocating...

Changed to FullTransactionIdFromEpochAndXid().

> > + dest->value++;
> > + if (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
> > + *dest = MakeFullTransactionId(EpochFromFullTransactionId(*dest),

> Hm, this seems pretty odd coding to me. Why not do something like
>
> dest->value++;
> while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
> dest->value++;
>
> That'd a) be a bit more readable, b) possible to do in a lockfree way at
> a later point.

Done.

> > - TransactionId nextXid; /* copy of ShmemVariableCache->nextXid */
> > + TransactionId nextXid; /* xid from ShmemVariableCache->nextFullXid */
>
> Probably worthwhile to pgindent partially.

Done.

I also added FullTransactionId to typedefs.list, bumped
PG_CONTROL_VERSION and did some peformance testing of tight loops of
GetNewTransactionId(), which showed no measurable change (~10 million
single-threaded calls per second either way on the machine I tested).

New version attached. I'd like to commit this for PG12.

--
Thomas Munro
https://enterprisedb.com

Attachment Content-Type Size
0001-Track-the-next-transaction-ID-using-64-bits-v7.patch application/octet-stream 48.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-03-25 04:07:33 Re: Removing unneeded self joins
Previous Message Amit Kapila 2019-03-25 03:23:27 Re: Error message inconsistency