Re: Assert in pageinspect with NULL pages

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Daria Lepikhova <d(dot)lepikhova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Assert in pageinspect with NULL pages
Date: 2022-03-27 20:26:26
Message-ID: CAH2-Wzk4czia7nPJaYQPfpecnQH6L6gRrO4ZC7dj3sOFa1D-JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 27, 2022 at 7:24 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> It's my impression that there are many ways of crashing the system
> using pageinspect right now. We aren't very careful about making sure
> that our code that reads from disk doesn't crash if the data on disk
> is corrupted, and all of that systemic weakness is inherited by
> pageinspect.

It's significantly worse than the core code, I'd say (before we even
consider the fact that you can supply your own pages). At least with
index AMs, where functions like _bt_checkpage() and gistcheckpage()
provide the most basic sanity checks for any page that is read from
disk.

> Fixing this seems like a lot of work and problematic for various
> reasons. And I do agree that if we can allow people to look at what's
> going on with a corrupt page, that's useful. On the other hand, if by
> "looking" they crash the system, that sucks a lot. So overall I think
> we need a lot more sanity checks here.

I don't think it's particularly hard to do a little more. That's all
it would take to prevent practically all crashes. We're not dealing
with adversarial page images here.

Sure, it would be overkill to fully adapt something like
palloc_btree_page() for pageinspect, for the reason Michael gave. But
there are a couple of checks that it does that would avoid practically
all real world hard crashes, without any plausible downside:

* The basic _bt_checkpage() call that every nbtree page uses in the
core code (as well as in amcheck).

* The maxoffset > MaxIndexTuplesPerPage check.

You could even go a bit further, and care about individual index
tuples. My commit 93ee38eade added some basic sanity checks for index
tuples to bt_page_items(). That could go a bit further as well. In
particular, the sanity checks from amcheck's PageGetItemIdCareful()
function could be carried over. That would make it impossible for
bt_page_items() to read past the end of the page when given a page
that has an index tuple whose item pointer offset has some wildly
unreasonable value.

I'm not volunteering. Just saying that this is quite possible.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-03-27 20:44:23 Re: Small TAP tests cleanup for Windows and unused modules
Previous Message Tom Lane 2022-03-27 20:19:53 Re: Frontend error logging style