Re: pgstattuple "unexpected zero page" for gist and hash indexes

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Nitin Motiani <nitinmotiani(at)google(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgstattuple "unexpected zero page" for gist and hash indexes
Date: 2025-09-30 07:28:08
Message-ID: aNuGiAPtrYLCY9TA@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 29, 2025 at 09:03:13AM +0530, Dilip Kumar wrote:
> On Tue, Sep 16, 2025 at 1:06 PM Nitin Motiani <nitinmotiani(at)google(dot)com> wrote:
>> For hash indexes, this error comes from pgstat_hash_page() ->
>> _hash_getbuf_with_strategy() -> _hash_checkpage() for a zero page.
>> Similarly gistcheckpage() is the cause for gist indexes.
>>
>> Since after a crash recovery a zero page is normal (as long as not
>> referenced from the rest of the index), emitting
>> ERRCODE_INDEX_CORRUPTED is a false alarm.
>
> I agree this seems like a false alarm, and at least its inconsistent
> with the btree.

It also means that if one is doing a join of pg_class with pgstattuple
to gather some diagnostics on a given database, then one is required
to wrap pgstattuple into a custom plpgsql function to catch errors,
like you did, or avoid these indexes in all cases. This is not
user-friendly in both cases.

On Tue, Sep 30, 2025 at 10:46:11AM +0530, Nitin Motiani wrote:
> On Mon, Sep 29, 2025 at 9:03 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>> I observed that pgstat_btree_page() incorporates the count of new
>> pages into its free space calculation[1]. Does it make sense to do the
>> same for hash and gist as well as we are leaning towards making these
>> consistent.
>>
>> [1]
>> if (PageIsNew(page))
>> {
>> /* fully empty page */
>> stat->free_space += BLCKSZ;
>> }
>
> Thanks for the feedback. Yes, it makes sense to do the same free space
> calculation for the other index types unless there is some special
> handling in btree implementation of the free space. From the
> pgstattuple code, I don't see any special handling so I have made the
> change suggested by you. I'm attaching patch v2 with that change.

Including the space for the new page for gist in the size makes sense
here. gist has protections for empty pages in VACUUM, and can grab
empty pages through the FSM.

For hash, we don't have much of that AFAIK, but your argument with the
timely crash validates the fact that we'd better include this number
in the free size as it is part of the relation anyway.

In short, fine by me for both. If others have an opinion, feel free.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-09-30 07:37:25 Re: [BUG]: the walsender does not update its IO statistics until it exits
Previous Message 龙小龙 2025-09-30 07:16:16 Re: When creating index, why pointing to old version of tuple