Re: Possibly too stringent Assert() in b-tree code

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Possibly too stringent Assert() in b-tree code
Date: 2016-09-24 13:06:49
Message-ID: CAA4eK1JGM0216ZhQbSR58Sp9rgMiB=fg+SLv8Er_fqwZ+aVZcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 23, 2016 at 12:21 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Sep 19, 2016 at 7:07 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> I think you have a valid point. It seems we don't need to write WAL
>>> for reuse page (aka don't call _bt_log_reuse_page()), if the page is
>>> new, as the only purpose of that log is to handle conflict based on
>>> transaction id stored in special area which will be anyway zero.
>
>> +1.
>
> This is clearly an oversight in Simon's patch fafa374f2, which introduced
> this code without any consideration for the possibility that the page
> doesn't have a valid special area. We could prevent the crash by
> doing nothing if PageIsNew, a la
>
> if (_bt_page_recyclable(page))
> {
> /*
> * If we are generating WAL for Hot Standby then create a
> * WAL record that will allow us to conflict with queries
> * running on standby.
> */
> - if (XLogStandbyInfoActive() && RelationNeedsWAL(rel))
> + if (XLogStandbyInfoActive() && RelationNeedsWAL(rel) &&
> + !PageIsNew(page))
> {
> BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
>
> _bt_log_reuse_page(rel, blkno, opaque->btpo.xact);
> }
>
> /* Okay to use page. Re-initialize and return it */
>
> but I'm not very clear on whether this is a safe fix, mainly because
> I don't understand what the reuse WAL record really accomplishes.
> Maybe we need to instead generate a reuse record with some special
> transaction ID indicating worst-case assumptions?
>

I think it is basically to ensure that the queries on standby should
not be considering the page being recycled as valid and if there is
any pending query to which this page is visible, cancel it. If this
understanding is correct, then I think the fix you are proposing is
correct.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-09-24 13:07:21 Re: Parallel tuplesort (for parallel B-Tree index creation)
Previous Message Peter Geoghegan 2016-09-24 13:03:07 Re: Parallel tuplesort (for parallel B-Tree index creation)