Re: Assert in pageinspect with NULL pages

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Daria Lepikhova <d(dot)lepikhova(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assert in pageinspect with NULL pages
Date: 2022-02-18 03:02:22
Message-ID: Yg8MPrV49/9Usqs1@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 17, 2022 at 05:40:41PM +0800, Julien Rouhaud wrote:
> About the patch, it's incorrectly using a hardcoded 8192 block-size rather than
> using the computed :block_size (or computing one when it's not already the
> case).

Well, the tests of pageinspect fail would already fail when using a
different page size than 8k, like the checksum ones :)

Anywa, I agree with your point that if this is not a reason to not
make the tests more portable if we can do easily.

> I'm also wondering if it wouldn't be better to return NULL rather than throwing
> an error for uninitialized pages.

Agreed that this is a sensible choice. NULL would be helpful for the
case where one compiles all the checksums of a relation with a full
scan based on the relation size, for example. All these behaviors
ought to be documented properly, as well. For SRFs, this should just
return no rows rather than generating an error.

> While at it, it could also be worthwhile to add tests for invalid blkno and
> block size in page_checksum().

If we are on that, we could also have tests for the various "input
page too small" errors. Now, these are just improvements, so I'd
rather treat this part separately of the issue reported, and add the
extra tests only on HEAD.

I can't help but notice that we should have a similar protection on
PageIsNew() in heap_page_items()? Now the code happens to work for an
empty page thanks to PageGetMaxOffsetNumber(), but this extra
protection would make the code more solid in the long-term IMO for the
full-zero case? It is worth noting that, in pageinspect, two of the
heap functions are the only ones to not be strict..

Something similar could be said about page_header() in rawpage.c,
though it is not wrong to return only zeros for a page full of zeros.

Shouldn't we similarly care about bt_metap() and
bt_page_stats_internal(), both callers of ReadBuffer(). The root
point is that the RBM_NORMAL mode of ReadBufferExtended() considers a
page full of zeros as valid, as per its top comments.

For the same reason, get_raw_page_internal() would work just fine and
return a page full of zeros.

Also, instead of an error in get_page_from_raw(), we should make sure
that all its callers are able to map with the case of a new page,
where we return 8kB worth of zeros from this inner routine.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joseph Koshakow 2022-02-18 03:04:05 Fix formatting of Interval output
Previous Message Justin Pryzby 2022-02-18 03:00:20 Re: Assert in pageinspect with NULL pages