Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node

From: Andres Freund <andres(at)anarazel(dot)de>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thunder <thunder1(at)126(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
Date: 2020-02-03 13:49:18
Message-ID: 20200203134918.rtyclvh2u2ny7bto@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-01-21 15:41:54 +0900, Fujii Masao wrote:
> On 2020/01/21 13:39, Michael Paquier wrote:
> > On Tue, Jan 21, 2020 at 08:45:14AM +0530, Amit Kapila wrote:
> > > The original email doesn't say so. I might be missing something, but
> > > can you explain what makes you think so.
> >
> > Oops. Incorrect thread, I was thinking about this one previously:
> > https://www.postgresql.org/message-id/822113470.250068.1573246011818@connect.xfinity.com
> >
> > Re-reading the root of the thread, I am still not sure what we could
> > do, as that's rather tricky.

Did anybody consider the proposal at
https://www.postgresql.org/message-id/20191223005430.yhf4n3zr4ojwbcn2%40alap3.anarazel.de ?
I think we're going to have to do something like that to actually fix
the problem, rather than polish around the edges.

> See here:
> https://www.postgresql.org/message-id/20190927061414.GF8485@paquier.xyz

On 2019-09-27 15:14:14 +0900, Michael Paquier wrote:
> Wrapping the call of smgrtruncate() within RelationTruncate() to use a
> critical section would make things worse from the user perspective on
> the primary, no? If the physical truncation fails, we would still
> fail WAL replay on the standby, but instead of generating an ERROR in
> the session of the user attempting the TRUNCATE, the whole primary
> would be taken down.

FWIW, to me this argument just doesn't make any sense - even if a few
people have argued it.

A failure in the FS truncate currently leaves the cluster in a
corrupted state in multiple ways:
1) Dirty buffer contents were thrown away, and going forward their old
contents will be read back.
2) We have WAL logged something that we haven't done. That's *obviously*
something *completely* violating WAL logging rules. And it breaks WAL
replay (including locally, should we crash before the next
checkpoint - there could be subsequent WAL records relying on the
block's existence).

That's so obviously worse than a PANIC restart, that I really don't
understand the "worse from the user perspective" argument from your
email above. Obviously it sucks that the error might re-occur during
recovery. But that's something that usually actually can be fixed -
whereas the data corruption can't.
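
For illustration only, here is a toy Python model of the sequence described
above: log the truncation, drop the buffers, then have the physical truncate
fail. Nothing in it is PostgreSQL code. ToyRelation, update, truncate, replay,
Panic and the critical flag are hypothetical stand-ins, and the ordering of the
three steps is an assumption taken from the two failure modes listed above.

```python
# Toy model only: none of these names are PostgreSQL APIs.

class Panic(Exception):
    """Stands in for a PANIC (crash and restart into recovery)."""


class ToyRelation:
    def __init__(self, blocks):
        self.disk = list(blocks)  # on-disk contents, indexed by block number
        self.buffers = {}         # block number -> dirty contents not yet written

    def read(self, blkno):
        # A cached buffer wins over the on-disk copy.
        return self.buffers.get(blkno, self.disk[blkno])


def update(rel, wal, blkno, value):
    """Log a change to an existing block, then dirty its buffer."""
    if blkno >= len(rel.disk):
        raise IndexError("block does not exist")
    wal.append(("update", blkno, value))
    rel.buffers[blkno] = value


def truncate(rel, wal, nblocks, fs_ok=True, critical=False):
    """Truncate to nblocks; fs_ok=False simulates a failing FS truncate."""
    wal.append(("truncate", nblocks))          # 1. log the truncation
    for blkno in [b for b in rel.buffers if b >= nblocks]:
        del rel.buffers[blkno]                 # 2. drop buffers, dirty or not
    if not fs_ok:                              # 3. the physical truncate fails
        if critical:
            raise Panic("truncate failed in critical section")
        raise OSError("truncate failed")       # plain ERROR, session lives on
    del rel.disk[nblocks:]


def replay(base_blocks, wal):
    """Apply the WAL to a copy of the base blocks, as a standby would."""
    disk = list(base_blocks)
    for rec in wal:
        if rec[0] == "truncate":
            del disk[rec[1]:]
        else:
            _, blkno, value = rec
            if blkno >= len(disk):
                raise Panic(f"WAL references missing block {blkno}")
            disk[blkno] = value
    return disk
```

In this model, starting from blocks ["a", "b", "c"], an update of block 2
followed by a failed truncate to 2 blocks with critical=False leaves read(2)
returning the old "c", and a later update of block 2 makes replay() raise
Panic. With critical=True the WAL ends at the truncate record, and replaying it
gives a consistent two-block relation.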

> The original proposal, i.e., holding the interrupts during
> the truncation, is worth considering? It is not a perfect
> solution but might improve the situation a bit.

I don't think it's useful in isolation.

Greetings,

Andres Freund
