Re: recovering from "found xmin ... from before relfrozenxid ..."

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, MBeena Emerson <mbeena(dot)emerson(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Date: 2020-10-29 04:00:30
Message-ID: 20201029040030.p4osrmaywhqaesd4@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-10-28 19:09:14 -0700, Andres Freund wrote:
> On 2020-10-28 18:13:44 -0700, Andres Freund wrote:
> > Just pushed this. Let's see what the BF says...
>
> It says that apparently something is unstable about my new test. It
> first passed on a few animals, but then failed a lot in a row. Looking.

The differentiating factor is force_parallel_mode=regress.

Ugh, this is nasty: The problem is that we can end up computing the
horizons the first time before MyDatabaseId is even set. Which leads us
to compute a too aggressive horizon for plain tables, because we skip
over every backend connected to a database, as MyDatabaseId still is
InvalidOid:

    /*
     * Normally queries in other databases are ignored for anything but
     * the shared horizon. But in recovery we cannot compute an accurate
     * per-database horizon as all xids are managed via the
     * KnownAssignedXids machinery.
     */
    if (in_recovery ||
        proc->databaseId == MyDatabaseId ||
        proc->databaseId == 0) /* always include WalSender */
        h->data_oldest_nonremovable =
            TransactionIdOlder(h->data_oldest_nonremovable, xmin);

That then subsequently leads us to consider a row fully dead in
heap_hot_search_buffer(), triggering the killtuples logic and causing
the test to fail.
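
To make the failure mode concrete, here is a minimal standalone sketch of the filter above. It is not the PostgreSQL implementation: ToyProc, toy_older() and toy_data_horizon() are hypothetical names, the horizon is seeded from a caller-supplied starting value, and xid wraparound is ignored (the real TransactionIdOlder() handles it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t TransactionId;

#define InvalidOid ((Oid) 0)

/* Hypothetical, heavily simplified stand-in for a PGPROC entry. */
typedef struct ToyProc
{
    Oid           databaseId;   /* 0 for backends not tied to a database */
    TransactionId xmin;         /* oldest xid this backend still needs */
} ToyProc;

/* Assumption: plain integer comparison, no wraparound handling. */
static TransactionId
toy_older(TransactionId a, TransactionId b)
{
    return (a < b) ? a : b;
}

/*
 * Toy version of the per-database horizon computation. "start" is the
 * value the horizon has if no backend holds it back.
 */
static TransactionId
toy_data_horizon(const ToyProc *procs, size_t nprocs,
                 Oid my_database_id, bool in_recovery,
                 TransactionId start)
{
    TransactionId horizon = start;

    assert(procs != NULL || nprocs == 0);

    for (size_t i = 0; i < nprocs; i++)
    {
        /* Same shape as the condition quoted above. */
        if (in_recovery ||
            procs[i].databaseId == my_database_id ||
            procs[i].databaseId == 0)
            horizon = toy_older(horizon, procs[i].xmin);
    }

    return horizon;
}
```

With two backends, (databaseId 16384, xmin 100) and (databaseId 16385, xmin 500), and a starting value of 1000, passing InvalidOid as my_database_id returns 1000, because neither backend matches the filter. Passing 16384 returns 100. The first result is the too aggressive horizon: rows that the backend with xmin 100 can still see would be treated as removable.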

With force_parallel_mode=regress we constantly start parallel workers,
which makes it much more likely that this case is hit.

It's trivial to fix luckily...

Greetings,

Andres Freund
