From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | Nitin Motiani <nitinmotiani(at)google(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: pgstattuple "unexpected zero page" for gist and hash indexes |
Date: | 2025-09-29 03:33:13 |
Message-ID: | CAFiTN-sjz2+7FpXheAQag81y_3FsaQjwzaTTTUi6kmVhwqQh5A@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Sep 16, 2025 at 1:06 PM Nitin Motiani <nitinmotiani(at)google(dot)com> wrote:
>
> For hash indexes, this error comes from pgstat_hash_page() ->
> _hash_getbuf_with_strategy() -> _hash_checkpage() for a zero page.
> Similarly gistcheckpage() is the cause for gist indexes.
>
> Since after a crash recovery a zero page is normal (as long as not
> referenced from the rest of the index), emitting
> ERRCODE_INDEX_CORRUPTED is a false alarm.
I agree this seems like a false alarm, and at least its inconsistent
with the btree.
> To fix this, we propose that we remove the checks from
> pgstat_hash_page() and pgstat_gist_page(). This is something which
> pgstat_btree_page() already does. It uses a lower-level function
> ReadBufferExtended() and avoids using _bt_getbuf() which would check
> for the "unexpected zero page".
>
> I'm attaching a patch for the above which does the following :
>
> 1. Replaces _hash_getbuf_with_strategy() with ReadBufferExtended() in
> pgstat_hash_page(). Then for non-PageIsNew() pages, we explicitly
> check the PageGetSpecialSize().
>
> 2. Similarly gistcheckpage() is removed in pgstat_gist_page(). And for
> non-PageIsNew(), we explicitly check the PageGetSpecialSize() before
> checking if it's a leaf page.
>
> I confirmed that this was working by running the above mentioned
> simulation steps along with the patch.
Yeah that seems fine to me.
> Note that _hash_getbuf_with_strategy() also checks if blkno is P_NEW.
> But that check seems unnecessary here as P_NEW is the same as
> InvalidBlockNo and pgstat_hash_page() is called by pgstat_index() for
> the block nos starting from HASH_METAPAGE + 1 and remains <
> RelationGetNumberOfBlocks().
Right, we don't need that check here.
I observed that pgstat_btree_page() incorporates the count of new
pages into its free space calculation[1]. Does it make sense to do the
same for hash and gist as well as we are leaning towards making these
consistent.
[1]
if (PageIsNew(page))
{
/* fully empty page */
stat->free_space += BLCKSZ;
}
--
Regards,
Dilip Kumar
Google
From | Date | Subject | |
---|---|---|---|
Next Message | shveta malik | 2025-09-29 03:48:55 | Re: Clear logical slot's 'synced' flag on promotion of standby |
Previous Message | Peter Smith | 2025-09-29 03:28:12 | Re: Skipping schema changes in publication |