Re: Bug in ProcArrayApplyRecoveryInfo for snapshots crossing 4B, breaking replicas

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Bug in ProcArrayApplyRecoveryInfo for snapshots crossing 4B, breaking replicas
Date: 2022-01-24 21:45:48
Message-ID: 35291484-f007-52c5-18a6-80c345f5c2ed@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/24/22 22:28, Bossart, Nathan wrote:
> On 1/22/22, 4:43 PM, "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> There's a bug in ProcArrayApplyRecoveryInfo, introduced by 8431e296ea,
>> which may cause failures when starting a replica, making it unusable.
>> The commit message for 8431e296ea is not very clear about what exactly
>> is being done and why, but the root cause is that at while processing
>> RUNNING_XACTS, the XIDs are sorted like this:
>>
>> /*
>> * Sort the array so that we can add them safely into
>> * KnownAssignedXids.
>> */
>> qsort(xids, nxids, sizeof(TransactionId), xidComparator);
>>
>> where "safely" likely means "not violating the ordering expected by
>> KnownAssignedXidsAdd". Unfortunately, xidComparator compares the values
>> as plain uint32 values, while KnownAssignedXidsAdd actually calls
>> TransactionIdFollowsOrEquals() and compares the logical XIDs :-(
>
> Wow, nice find.
>
>> This likely explains why we never got any reports about this - most
>> systems probably don't leave transactions running for this long, so the
>> probability is much lower. And replica restarts are generally not that
>> common events either.
>
> I'm aware of one report with the same message [0], but I haven't read
> closely enough to determine whether it is the same issue. It looks
> like that particular report was attributed to backup_label being
> removed.
>

Yeah, I saw that thread too, and I don't think it's the same issue. As
you say, it seems to be caused by the backup_label shenanigans, and
there's also the RUNNING_XACTS message:

Sep 20 15:00:27 ... CONTEXT: xlog redo Standby/RUNNING_XACTS: nextXid
38585 latestCompletedXid 38571 oldestRunningXid 38572; 14 xacts: 38573
38575 38579 38578 38574 38581 38580 38576 38577 38572 38582 38584 38583
38583

The XIDs don't cross the 4B boundary at all, so this seems unrelated.

>> Attached patch is fixing this by just sorting the XIDs logically. The
>> xidComparator is meant for places that can't do logical ordering. But
>> these XIDs come from RUNNING_XACTS, so they actually come from the same
>> wraparound epoch (so sorting logically seems perfectly fine).
>
> The patch looks reasonable to me.
>

Thanks!

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2022-01-24 21:47:28 Re: fairywren is generating bogus BASE_BACKUP commands
Previous Message Robert Haas 2022-01-24 21:43:31 Re: [BUG]Update Toast data failure in logical replication