Re: recovering from "found xmin ... from before relfrozenxid ..."

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, MBeena Emerson <mbeena(dot)emerson(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Date: 2020-10-29 04:00:30
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2020-10-28 19:09:14 -0700, Andres Freund wrote:
> On 2020-10-28 18:13:44 -0700, Andres Freund wrote:
> > Just pushed this. Let's see what the BF says...
> It says that apparently something is unstable about my new test. It
> first passed on a few animals, but then failed a lot in a row. Looking.

The differentiating factor is force_parallel_mode=regress.

Ugh, this is nasty: The problem is that we can end up computing the
horizons the first time before MyDatabaseId is even set. Which leads us
to compute a too aggressive horizon for plain tables, because we skip
over them, as MyDatabaseId still is InvalidOid:

* Normally queries in other databases are ignored for anything but
* the shared horizon. But in recovery we cannot compute an accurate
* per-database horizon as all xids are managed via the
* KnownAssignedXids machinery.
if (in_recovery ||
proc->databaseId == MyDatabaseId ||
proc->databaseId == 0) /* always include WalSender */
h->data_oldest_nonremovable =
TransactionIdOlder(h->data_oldest_nonremovable, xmin);

That then subsequently leads us consider a row fully dead in
heap_hot_search_buffers(). Triggering the killtuples logic. Causing the
test to fail.

With force_parallel_mode=regress we constantly start parallel workers,
which makes it much more likely that this case is hit.

It's trivial to fix luckily...


Andres Freund

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2020-10-29 04:01:59 Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Previous Message Fujii Masao 2020-10-29 03:38:08 Re: document pg_settings view doesn't display custom options