Re: why do hash index builds use smgrextend() for new splitpoint pages

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: why do hash index builds use smgrextend() for new splitpoint pages
Date: 2022-02-28 05:59:32
Message-ID: CAA4eK1KbPo8+XtJf1Cc6rtLfwHYribSvCW=WRwggmeG8SP3c3w@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 26, 2022 at 9:17 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Fri, Feb 25, 2022 at 11:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sat, Feb 26, 2022 at 3:01 AM Melanie Plageman
> > <melanieplageman(at)gmail(dot)com> wrote:
> > >
> > > Since _hash_alloc_buckets() WAL-logs the last page of the
> > > splitpoint, is it safe to skip the smgrimmedsync()? What if the last
> > > page of the splitpoint doesn't end up having any tuples added to it
> > > during the index build and the redo pointer is moved past the WAL for
> > > this page and then later there is a crash sometime before this page
> > > makes it to permanent storage. Does it matter that this page is lost? If
> > > not, then why bother WAL-logging it?
> > >
> >
> > I think we don't care if the page is lost before we update the
> > meta-page in the caller, because in that case we will simply try to
> > reallocate. But we do care after the meta-page update (which records
> > the extension), in which case we won't lose this last page because
> > the sync request for it would have been registered via smgrextend()
> > before the meta-page update.
>
> and could it happen that during smgrextend() for the last page, a
> checkpoint starts and finishes between FileWrite() and
> register_dirty_segment(), then index build finishes, and then a crash
> occurs before another checkpoint completes the pending fsync for that
> last page?
>

Yeah, this seems possible. The problem then would be that the index's
idea of the EOF and smgr's idea of the EOF could differ, which could
cause trouble when we try to get a new page via _hash_getnewbuf(). If
this theory turns out to be true, we could probably get an error either
because the disk is full or because the index requests a block beyond
the EOF as determined by RelationGetNumberOfBlocksInFork() in
_hash_getnewbuf().

Can we try to reproduce this scenario with the help of a debugger to
see if we are missing something?
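Something along these lines might work as a starting point (a
hypothetical recipe; the exact breakpoint location depends on reading
md.c for your build, and the function names here are from my memory of
it):

```
# Session 1: attach to the backend that will build the index, and stop
# it inside smgrextend()/mdextend() after the write but before the
# sync request is registered.
gdb -p <backend_pid>
(gdb) break register_dirty_segment
(gdb) continue
-- in that backend: CREATE INDEX ... USING hash (...);

# Session 2: while the backend is stopped at the breakpoint, force a
# checkpoint so the redo pointer moves past the page's WAL record.
psql -c CHECKPOINT

# Session 1: let the index build finish, then crash the server before
# any further checkpoint (e.g. pg_ctl stop -m immediate) and compare
# the relation's on-disk EOF with what the metapage claims.
(gdb) continue
```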

--
With Regards,
Amit Kapila.
