Re: new heapcheck contrib module

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2020-05-14 00:36:36
Message-ID: CAH2-Wzkam3w_nSjQ78KtMODeonwfnq5fjsBpi8jH-Lsjrcr-ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 13, 2020 at 5:18 PM Mark Dilger
<mark(dot)dilger(at)enterprisedb(dot)com> wrote:
> I am not removing any assertions. I do not propose to remove any assertions. When I talk about "hardening against assertions", that is not in any way a proposal to remove assertions from the code.

I'm sorry if I seemed to suggest that you wanted to remove assertions,
rather than test more things earlier. I recognize that that could be a
useful thing to do, both in general, and maybe even in the specific
example you gave -- on general robustness grounds. At the same time,
it's something that can only be taken so far. It's probably not going
to make it practical to corrupt data in a regression test or tap test.

> There is a separate but related question in the offing about whether the backend code, independently of any amcheck contrib stuff, should be more paranoid in how it processes tuples to check for corruption.

I bet that there is something that we could do to be a bit more
defensive. Of course, we do a certain amount of that on general
robustness grounds already. A systematic review of that could be quite
useful. But as you point out, it's not really in scope here.

> > I would be willing to make a larger effort to avoid crashing a
> > backend, since that affects production. I might go to some effort to
> > not crash with downright adversarial inputs, for example. But it seems
> > inappropriate to take extreme measures just to avoid a crash with
> > extremely contrived inputs that will probably never occur.
>
> I think this is a misrepresentation of the tests that I've been running.

I didn't actually mean it that way, but I can see how my words could
reasonably be interpreted that way. I apologize.

> There are two kinds of tests that I have done:
>
> First, there is the regression tests, t/004_verify_heapam.pl, which is obviously contrived. That was included in the regression test suite because it needed to be something other developers could read, verify, "yeah, I can see why that would be corruption, and would give an error message of the sort the test expects", and then could be run to verify that indeed that expected error message was generated.

I still don't think that this is necessary. It could work for one type
of corruption, that happens to not have any of the problems, but just
testing that one type of corruption seems rather arbitrary to me.

> The second kind of corruption test I have been running is nothing more than writing random nonsense into randomly chosen locations within heap files and then running verify_heapam against those heap relations. It's much more Murphy than Machiavelli when it's just generated by calling random().

That sounds like a good initial test case, to guide your intuitions
about how to make the feature robust.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-05-14 00:51:41 Re: PG 13 release notes, first draft
Previous Message Mark Dilger 2020-05-14 00:18:40 Re: new heapcheck contrib module