Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
Date: 2023-01-10 14:03:42
Message-ID: CAEze2Wgb54bkt1353pkuAbenYj84s=iZOiJ5wDgx6W3C3UFaYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 9 Jan 2023 at 20:34, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-01-09 17:50:10 +0100, Matthias van de Meent wrote:
> > Wouldn't it be enough to only fix the constructions in
> > FullXidRelativeTo() and widen_snapshot_xid() (as attached, $topic does
> > not occur with the patch), and (optionally) bump the first XID
> > available for any cluster to (FirstNormalXid + 1) to retain the 'older
> > than any running transaction' property?
>
> It's not too hard to fix in individual places, but I suspect that we'll
> introduce the bug in future places without some more fundamental protection.
>
> Locally I fixed it by clamping vacuum_defer_cleanup_age to a reasonable value
> in ComputeXidHorizons() and GetSnapshotData().

I don't think that clamping the value with oldestXid (as seen in patch
0001, in GetSnapshotData) is right.
It would clamp the value relative to the oldest frozen xid of all
databases, which can be millions of transactions behind oldestXmin,
and thus severely skew the amount of transaction's changes you keep on
disk (that is, until oldestXid moves past 1000_000).
A similar case can be made for the changes in ComputeXidHorizons - for
the described behaviour of vacuum_defer_cleanup_age we would need to
clamp the used offset separately for each of the fields in the horizon
result to retain all transaction data for the first 1 million
transactions, and the ones that may still see these transactions.

> Fixing it in FullXidRelativeTo() doesn't seem quite right, while it's ok to
> just return FirstNormalTransactionId in the case of vacuum_defer_cleanup_age,
> it doesn't see necessarily correct for other cases.

Understood.

Kind regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Takamichi Osumi (Fujitsu) 2023-01-10 14:11:49 RE: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Fabien COELHO 2023-01-10 13:43:58 Re: pgbench - adding pl/pgsql versions of tests