Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
Date: 2023-01-08 03:09:56
Message-ID: 20230108030956.5q6ptk3boeisibjq@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-01-07 16:29:23 -0800, Andres Freund wrote:
> It's probably not too hard to fix specifically in this one place - we could
> just clamp vacuum_defer_cleanup_age to be smaller than latest_completed, but
> it strikes me as as a somewhat larger issue for the 64it xid infrastructure. I
> suspect this might not be the only place running into problems with such
> "before the universe" xids.

I haven't found other problematic places in HEAD, but did end up find a less
serious version of this bug in < 14: GetFullRecentGlobalXmin(). I did verify
that with vacuum_defer_cleanup_age set GetFullRecentGlobalXmin() returns
values that look likely to cause problems. Its "just" used in gist luckily.

It's hard to find places that do this kind of arithmetic, we traditionally
haven't had a helper for it. So it's open-coded in various ways.

xidStopLimit = xidWrapLimit - 3000000;
if (xidStopLimit < FirstNormalTransactionId)
xidStopLimit -= FirstNormalTransactionId;

and oddly:
xidVacLimit = oldest_datfrozenxid + autovacuum_freeze_max_age;
if (xidVacLimit < FirstNormalTransactionId)
xidVacLimit += FirstNormalTransactionId;

or (in < 14):

RecentGlobalXmin = globalxmin - vacuum_defer_cleanup_age;
if (!TransactionIdIsNormal(RecentGlobalXmin))
RecentGlobalXmin = FirstNormalTransactionId;

The currently existing places I found, other than the ones in procarray.c,
luckily don't seem to convert the xids to 64bit xids.

> For a bit I was thinking that we should introduce the notion that a
> FullTransactionId can be from the past. Specifically that when the upper 32bit
> are all set, we treat the lower 32bit as a value from before xid 0 using the
> normal 32bit xid arithmetic. But it sucks to add overhead for that
> everywhere.
>
> It might be a bit more palatable to designate one individual value,
> e.g. 2^32-1<<32, as a "before xid 0" indicator - it doesn't matter how far
> before the start of the universe an xid point to...

On IM Thomas suggested we could reserve the 2^32-1 epoch for invalid values. I
hacked up a patch that converts various fxid functions to inline functions
with such assertions, and it indeed quickly catches the problem this thread
reported, close to the source of the use.

One issue with that is is that it'd reduce what can be input for the xid8
type. But it's hard to believe that'd be a real issue?

It's quite unfortunate that we don't have a test for vacuum_defer_cleanup_age
yet :(.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-01-08 03:25:49 Re: pgsql: Delay commit status checks until freezing executes.
Previous Message Amit Kapila 2023-01-08 02:14:03 Re: Perform streaming logical transactions by background workers and parallel apply