Re: [HACKERS] Possibly too stringent Assert() in b-tree code

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, simon(at)2ndquadrant(dot)com, ah(at)cybertec(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Possibly too stringent Assert() in b-tree code
Date: 2018-06-26 09:47:48
Message-ID: 20180626.184748.29200427.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello. I came from the following mail.

https://www.postgresql.org/message-id/flat/0F97FA9ABBDBE54F91744A9B37151A5116E4DD(at)g01jpexmbkw24#8be5cc9b7c3cf454a82e561d84f4118f

At Mon, 26 Sep 2016 09:12:04 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in <CAA4eK1K5YyDmndko0zzW6WNCN_DGFVHa6DCYcyuvkBWTH5+nUQ(at)mail(dot)gmail(dot)com>
> On Mon, Sep 26, 2016 at 8:54 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Sun, Sep 25, 2016 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> >>> On Fri, Sep 23, 2016 at 12:21 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>>> This is clearly an oversight in Simon's patch fafa374f2, which introduced
> >>>> this code without any consideration for the possibility that the page
> >>>> doesn't have a valid special area. ...
> >>>> but I'm not very clear on whether this is a safe fix, mainly because
> >>>> I don't understand what the reuse WAL record really accomplishes.
> >>>> Maybe we need to instead generate a reuse record with some special
> >>>> transaction ID indicating worst-case assumptions?
> >>
> >>> I think it is basically to ensure that the queries on standby should
> >>> not be considering the page being recycled as valid and if there is
> >>> any pending query to which this page is visible, cancel it. If this
> >>> understanding is correct, then I think the fix you are proposing is
> >>> correct.
> >>
> >> After thinking about it some more, I do not understand why we are emitting
> >> WAL here at all in *any* case, either for a new page or for a deleted one
> >> that we're about to recycle. Why wouldn't the appropriate actions have
> >> been taken when the page was deleted, or even before that when its last
> >> tuple was deleted?
> >>
> >
> > It seems to me that we do take actions for conflict resolution during
> > the page deletion (that looks to be covered by XLOG_HEAP2_CLEANUP_INFO
> > which we emit in vacuum), but not sure if that is sufficient.
> > Consider a case where the new transaction is started on standby after
> >
>
> Here by new transaction, I intend to say some newer snapshot with
> valid MyPgXact->xmin.

I agree to the diagnosis. So the WAL record is not necessary if
it is a new page since no one cannot be grabbing it.

_bt_page_recyclable() has two callers. _bt_getbuf and
btvacuumpage. A comment in btvacuumpage is saying that.

> * _bt_checkpage(), which will barf on an all-zero page. We want to
> * recycle all-zero pages, not fail. Also, we want to use a nondefault

So it is right thing for _bt_page_recyclable() to return true for
zeroed pages and it is consistent with the comment in
_bt_page_recyclable().

At the problematic point, a new page is safe to recycle but
there's no need to issue WAL record since it is not recycled with
the meaning that _bt_log_resuse_page() thinking.

> _bt_log_reuse_page(Relation rel, BlockNumber blkno, TransactionId latestRemovedXid)
> {
> xl_btree_reuse_page xlrec_reuse;
>
> /*
> * Note that we don't register the buffer with the record, because this
> * operation doesn't modify the page. This record only exists to provide a
> * conflict point for Hot Standby.
> */

I think we have all requred testimony from exiting code and
comments. FWIW my conclustion is that Tom's solution upthread is
right thing to do and needs addition in comment.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
fix_bt_getbuf_newpage_crash.patch text/x-patch 865 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2018-06-26 09:49:53 Re: ssl_library parameter
Previous Message Peter Eisentraut 2018-06-26 09:46:43 Re: We still claim "cannot begin/end transactions in PL/pgSQL"