Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date: 2020-08-28 17:29:16
Message-ID: 20200828172916.4jzfpbo72mt6mwlf@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-08-28 12:37:17 -0400, Robert Haas wrote:
> On Mon, Jul 20, 2020 at 4:30 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > If we really were to do something like this the option would need to be
> > called vacuum_allow_making_corruption_worse or such. Its need to be
> > *exceedingly* clear that it will likely lead to making everything much
> > worse.
>
> I don't really understand this objection. How does letting VACUUM
> continue after problems have been detected make anything worse?

It can break HOT chains, plain ctid chains etc, for example. Which, if
earlier / follower tuples are removed can't be detected anymore at a
later time.

> The point is that when you make VACUUM fail, you not only don't
> advance relfrozenxid/relminmxid, but also don't remove dead tuples. In
> the long run, either thing will kill you, but it is not difficult to
> have a situation where failing to remove dead tuples kills you a lot
> faster. The table can just bloat until performance tanks, and then the
> application goes down, even if you still had 100+ million XIDs before
> you needed a wraparound vacuum.
>
> Honestly, I wonder why continuing (but without advancing relfrozenxid
> or relminmxid) shouldn't be the default behavior. I mean, if it
> actually corrupts your data, then it clearly shouldn't be, and
> probably shouldn't even be an optional behavior, but I still don't see
> why it would do that.

I think it's an EXTREMELY bad idea to enable anything like this by
default. It'll make bugs entirely undiagnosable, because we'll remove a
lot of the evidence of what the problem is. And we've had many long
standing bugs in this area, several only found because we actually
started to bleat about them. And quite evidently, we have more bugs to
fix in the area.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-08-28 17:39:27 Re: poc - possibility to write window function in PL languages
Previous Message Stephen Frost 2020-08-28 17:25:32 Re: New default role- 'pg_read_all_data'