Re: Standby corruption after master is restarted

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: michael(at)paquier(dot)xyz
Cc: emre(at)hasegeli(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, pgsql-bugs(at)postgresql(dot)org, gurkan(dot)gur(at)innogames(dot)com, david(dot)pusch(at)innogames(dot)com, patrick(dot)schmidt(at)innogames(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Standby corruption after master is restarted
Date: 2018-04-27 00:49:08
Message-ID: 20180427.094908.00748548.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-bugs pgsql-hackers

Thank you for letting me know about that. Is there any way to find
out how a bug report was concluded? Or should I search -hackers
for a corresponding thread?

At Thu, 26 Apr 2018 21:13:48 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in <20180426121348(dot)GA2365(at)paquier(dot)xyz>
> On Thu, Apr 26, 2018 at 07:53:04PM +0900, Kyotaro HORIGUCHI wrote:
> > I think this behavior is a bug. XLogReadRecord is considering the
> > case but palloc_extended() breaks it. So in the attached, add a
> > new flag MCXT_ALLOC_NO_PARAMERR to palloc_extended() and
> > allocate_recordbuf calls it with the new flag. That alone fixes
> > the problem. However, the patch also frees the read state buffer
> > when facing erroneous records, since such records can leave a
> > too-large buffer allocated.
>
> This problem is already discussed here:
> https://commitfest.postgresql.org/18/1516/
>
> And here is the thread:
> https://www.postgresql.org/message-id/flat/0A3221C70F24FB45833433255569204D1F8B57AD(at)G01JPEXMBYT05
>
> Tsunakawa-san and I discussed a couple of approaches. Extending
> palloc_extended so that an incorrect length does not result in an error
> is also something that crossed my mind, but the length handling is
> different on the backend and the frontend, so I discarded the idea you
> are proposing here and instead relied on a check with AllocSizeIsValid,
> which gives a simpler patch:
> https://www.postgresql.org/message-id/20180314052753.GA16179%40paquier.xyz

Yeah, I probably considered all the approaches in that thread but
chose a different one. I'm fine with the approach in the thread.
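
For reference, the idea of checking the length before allocating can
be sketched roughly as below. This is only a standalone illustration,
not the patch from the thread: every name prefixed with sketch_ or
SKETCH_ is hypothetical, the size limit copies the value of
MaxAllocSize (1 GB - 1), the 8192-byte block size is an assumption,
and malloc stands in for the real allocator.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins so the sketch compiles on its own (assumptions, see above). */
#define SKETCH_MAX_ALLOC_SIZE ((uint64_t) 0x3fffffff)	/* 1 GB - 1 */
#define SKETCH_BLCKSZ 8192

/* Hypothetical, reduced reader state: only the record buffer. */
typedef struct SketchReaderState
{
	char	   *readRecordBuf;
	uint32_t	readRecordBufSize;
} SketchReaderState;

/* Same test that AllocSizeIsValid performs: size must not exceed the limit. */
static bool
sketch_alloc_size_is_valid(uint64_t size)
{
	return size <= SKETCH_MAX_ALLOC_SIZE;
}

/*
 * Grow the record buffer to hold reclength bytes.  Returns false, leaving
 * the old buffer untouched, when the length is implausible or memory is
 * exhausted, so the caller can treat the record as invalid.
 */
static bool
sketch_allocate_recordbuf(SketchReaderState *state, uint32_t reclength)
{
	/* 64-bit arithmetic so the rounding below cannot wrap around. */
	uint64_t	new_size = reclength;
	char	   *new_buf;

	/* Round up to the next multiple of the block size. */
	new_size += SKETCH_BLCKSZ - (new_size % SKETCH_BLCKSZ);

	/* Reject a garbage length here instead of erroring in the allocator. */
	if (!sketch_alloc_size_is_valid(new_size))
		return false;

	new_buf = malloc((size_t) new_size);
	if (new_buf == NULL)
		return false;

	free(state->readRecordBuf);
	state->readRecordBuf = new_buf;
	state->readRecordBufSize = (uint32_t) new_size;
	return true;
}
```

With a check like this, a bogus length read from a recycled segment
makes the allocation routine report failure rather than raise an
error, and the buffer that was already allocated stays as it was.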

> This got no interest from committers yet unfortunately, but I think that
> this is a real problem which should be back-patched :(

Several other WAL-related fixes are also waiting to be picked up.

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
