Re: Reviewing freeze map code

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reviewing freeze map code
Date: 2016-07-07 18:04:37
Message-ID: 20160707180437.slrvpcqwssgzbgll@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016-07-07 14:01:05 -0400, Robert Haas wrote:
> On Thu, Jul 7, 2016 at 1:58 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2016-07-07 10:37:15 -0400, Robert Haas wrote:
> >> On Wed, Jul 6, 2016 at 6:06 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> > Hm. We can't easily do that in the back-patched version; because a
> >> > standby won't know to check for the flag . That's kinda ok, since we
> >> > don't yet need to clear all-visible yet at that point of
> >> > heap_update. But that better means we don't do so on the master either.
> >>
> >> Is there any reason to back-patch this in the first place?
> >
> > It seems not unlikely that this has caused corruption in the past; and
> > that we chalked it up to hardware corruption or something. Both toasting
> > and file extension frequently take extended amounts of time under load,
> > the window for crashing in the wrong moment isn't small...
>
> Yeah, that's true, but I'm having a bit of trouble imagining exactly
> we end up with corruption that actually matters. I guess a torn page
> could do it.

I think Noah pointed out a bad scenario: If we crash after putting the
xid in the page header, but before WAL logging, the xid could get reused
after the crash. By a different transaction. And suddenly the row isn't
visible anymore, after the reused xid commits...

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-07-07 18:05:48 Re: Bug in batch tuplesort memory CLUSTER case (9.6 only)
Previous Message Robert Haas 2016-07-07 18:04:36 Re: Issue with bgworker, SPI and pgstat_report_stat