Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date: 2020-08-03 14:20:15
Message-ID: CA+TgmoY8LAyGhZgg9=mhQT_3o+sOROEEQTdoNxmr3-t3dUfqTA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 20, 2020 at 4:30 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I'm extremely doubtful this is a good idea. In all likelihood this will
> just exacerbate corruption.
>
> You cannot just stop freezing tuples, that'll lead to relfrozenxid
> getting *further* out of sync with the actual table contents. And you
> cannot just freeze such tuples, because that has a good chance of making
> deleted tuples suddenly visible, leading to unique constraint violations
> etc. Which will then subsequently lead to clog lookup errors and such.

I think that the behavior ought to be:

- If we encounter any damaged tuples (e.g. tuple xid < relfrozenxid),
we give up on advancing relfrozenxid and relminmxid. This vacuum won't
change them at all.

- We do nothing to the damaged tuples themselves.

- We can still prune pages, and we can still freeze tuples that do not
appear to be damaged.
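
To make the three rules above concrete, here is a rough, self-contained sketch in C. It is not PostgreSQL code: SketchTuple, SketchVacuumResult, sketch_vacuum and freeze_limit are hypothetical names invented for illustration, only xmin is modeled (no xmax, multixacts, relminmxid or pruning), and the special permanent XIDs are ignored. The comparison mimics the usual modulo-2^32 XID ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structures. */
typedef uint32_t Xid;

typedef struct SketchTuple
{
    Xid  xmin;    /* inserting transaction ID */
    bool frozen;  /* true once the tuple has been frozen */
} SketchTuple;

typedef struct SketchVacuumResult
{
    size_t nfrozen;           /* tuples frozen by this pass */
    size_t ndamaged;          /* tuples with xmin older than relfrozenxid */
    Xid    new_relfrozenxid;  /* unchanged if any damage was seen */
} SketchVacuumResult;

/* Wraparound-aware "a is older than b" (assumes normal XIDs only). */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

static SketchVacuumResult
sketch_vacuum(SketchTuple *tuples, size_t ntuples,
              Xid relfrozenxid, Xid freeze_limit)
{
    SketchVacuumResult result = {0, 0, relfrozenxid};

    assert(ntuples == 0 || tuples != NULL);

    for (size_t i = 0; i < ntuples; i++)
    {
        SketchTuple *tup = &tuples[i];

        if (tup->frozen)
            continue;

        if (xid_precedes(tup->xmin, relfrozenxid))
        {
            /* Damaged tuple: count it, but do nothing to it. */
            result.ndamaged++;
            continue;
        }

        /* Healthy tuple: freeze it if it is old enough. */
        if (xid_precedes(tup->xmin, freeze_limit))
        {
            tup->frozen = true;
            result.nfrozen++;
        }
    }

    /* Advance relfrozenxid only when no damaged tuple was encountered. */
    if (result.ndamaged == 0 && xid_precedes(relfrozenxid, freeze_limit))
        result.new_relfrozenxid = freeze_limit;

    return result;
}
```

With relfrozenxid = 100 and freeze_limit = 200, a table holding xmin values 150, 50 and 500 would get the first tuple frozen, the second reported as damaged and left alone, and relfrozenxid would stay at 100. Without the damaged tuple, relfrozenxid would move to 200.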

This amounts to an assumption that relfrozenxid is probably sane, and
that there are individual tuples that are messed up. It's probably not
the right thing if relfrozenxid got overwritten with a nonsense value
without changing the table contents. But I think it's difficult to
cater to all contingencies. In my experience, the normal problem here
is that there are a few tuples or pages in the table that somehow
escaped vacuuming for long enough that they contain references to XIDs
from before the last time relfrozenxid was advanced - so continuing to
do what we can to the rest of the table is the right thing to do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
