Re: txid_status returns NULL for recently commited transactions

From: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, dsarafannikov(at)yandex(dot)ru
Subject: Re: txid_status returns NULL for recently commited transactions
Date: 2018-09-26 13:42:59
Message-ID: CANtu0ohzMJmjoTr_N1Ni1-YuwRdxQ0PuDufbGbQ4aqnV43h0vw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

Got some new information.

There are 6 replicas and master in cluster. I rebooted two replicas... And
they started to work as expected!

So, on master and 4 untouched replicas:
txid_current() -> 4459388265
txid_status(BIGINT '4459388265') -> NULL

On two rebooted replicas:
txid_status(BIGINT '4459388265') -> 'commited'

All replicas are in sync with master.

Root cause is ShmemVariableCache.oldestClogXid value, as described in
previous message.

On master and 4 replicas:
(gdb) p ShmemVariableCache.oldestClogXid
$13 = 2207340131
(gdb) p ShmemVariableCache.oldestXid
$11 = 3764954191

On two rebooted replicas:
(gdb) p ShmemVariableCache.oldestClogXid
$14 = 3764954191
(gdb) p ShmemVariableCache.oldestXid
$12 = 3764954191

SELECT datname, age(datfrozenxid), datfrozenxid FROM pg_database;
template0 252790897 4207340131
postgres 302786659 4157344369
template1 302786564 4157344464
project 695176837 3764954191

As far as I remember master and replicas were not rebooted after upgrading
from 9.6 to 10. So, maybe issue is upgrade-related.

вт, 25 сент. 2018 г. в 22:22, Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>:

> Hi, thanks for the reply!
>
>
> > What are you using it for?
>
> I want to use it to validate status of related entities in other database
> (queue) in short interval after PG transaction commit/rollback.
>
>
> > I can't reproduce that...
>
> Yes, it happens only with one cluster. All others work as expected.
>
>
> > Your mailer appears to do very annoying things by converting numbers to
> phone numbers.
>
> Sorry.
>
>
> > It's from the last epoch. Plain xids are 32bit wide, the epochs deal
> > with values that are bigger. And 2207340131 <(220)%20734-0131> is less
> than 2^31 in the
> > past.
>
> Yes, and probably it is cause of the issue.
>
> ShmemVariableCache->oldestClogXid = 2207340131 <(220)%20734-0131>
>
> xid_epoch = 1
> xid = 150227913
> TransactionIdPrecedes(xid, ShmemVariableCache->oldestClogXid)) return TRUE
> , then TransactionIdInRecentPast return FALSE and txtd_status return NULL.
>
> But xid (1) and xid_epoch (150227913) are correct values from my active
> (or recently commited) transaction.
>
> >> SELECT txid_status(BIGINT '4294967295') -> 'commited'.
> >> SELECT txid_status(BIGINT '4294967296') -> NULL
> > Why do you think that is the wrong result?
>
> Let's leave it for now (maybe my misunderstanding). I think it is better
> to deal with "txid_status(txid_current()) -> NULL" issue first.
>
> Thanks,
> Michail.
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-09-26 14:07:03 Re: transction_timestamp() inside of procedures
Previous Message Bruce Momjian 2018-09-26 12:52:07 Re: transction_timestamp() inside of procedures