Re: new heapcheck contrib module

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: new heapcheck contrib module
Date: 2020-04-29 19:06:54
Message-ID: F845AC50-A479-43F3-ADFB-861A4D2AD287@enterprisedb.com
Lists: pgsql-hackers

> On Apr 29, 2020, at 11:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Apr 22, 2020 at 10:43 PM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>> It's simple enough to extend the tap test a little to check for those things. In v3, the tap test skips tests if the page size is not 8k, and also if the tuples do not fall on the page where expected (which would happen due to alignment issues, gremlins, or whatever).
>
> Skipping the test if the tuple isn't in the expected location sounds
> really bad. That will just lead to the tests passing without actually
> doing anything. If the tuple isn't in the expected location, the tests
> should fail.
>
>> There are other approaches, though. The HeapFile/HeapPage/HeapTuple perl modules recently submitted on another thread *could* be used here, but only if those modules are likely to be committed.
>
> Yeah, I don't know if we want that stuff or not.
>
>> This test *could* be extended to autodetect the page size and alignment issues and calculate at runtime where tuples will be on the page, but only if folks don't mind the test having that extra complexity in it. (There is a school of thought that regression tests should avoid excess complexity.) Do you have a recommendation about which way to go with this?
>
> How much extra complexity are we talking about?

The page size is easy to query, and the test already does so, skipping if the answer isn't 8k. Recalculating offsets from the page size instead of skipping would be easy enough, but the MAXALIGN stuff is a little harder. I don't know (perhaps someone would share?) how to easily query that from within a perl test. So the test could guess every alignment that occurs in the real world, read from the page at the offset each alignment would produce, and check whether the expected datum is there. To avoid false positives, the test would have to surround the datum being checked with bit patterns that cannot be misinterpreted as a match. That level of complexity seems unappealing, at least to me. It's not hard to write, but maintaining stuff like that is an unwelcome burden.
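To make the guessing approach concrete, here is a minimal sketch, in Python rather than perl purely for illustration. It assumes the tuple of interest is the first one placed on an otherwise empty heap page with no special space, so that it starts at page_size minus the MAXALIGN'd tuple length, and that 4 and 8 are the only alignments worth guessing. The function names and the sample byte pattern are hypothetical, not taken from the patch.

```python
def maxalign(length, alignment):
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def candidate_offsets(page_size, tuple_len, alignments=(4, 8)):
    """Map each guessed MAXALIGN to where the first tuple would start.

    Assumption: the tuple is the first one added to an empty heap page,
    so it sits at the very end of the page.
    """
    return {a: page_size - maxalign(tuple_len, a) for a in alignments}


def locate_datum(page, tuple_len, datum_off, expected, alignments=(4, 8)):
    """Return the guessed alignments under which `expected` is found
    at `datum_off` bytes past the start of the tuple."""
    hits = []
    for a, off in candidate_offsets(len(page), tuple_len, alignments).items():
        start = off + datum_off
        if page[start:start + len(expected)] == expected:
            hits.append(a)
    return hits


# Synthetic 8k page: a 28-byte tuple laid out as if MAXALIGN were 8,
# with a recognizable pattern at byte 24 of the tuple and zero padding after.
page = bytearray(8192)
tuple_start = 8192 - maxalign(28, 8)
page[tuple_start + 24:tuple_start + 28] = b"\xde\xad\xbe\xef"
matches = locate_datum(bytes(page), 28, 24, b"\xde\xad\xbe\xef")
```

Here the zero padding after the datum is what keeps the wrong guess from matching, which is the false-positive concern described above: with less distinctive surrounding bytes, more than one alignment could appear to fit.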

> It feels to me like
> for a heap page, the only things that are going to affect the position
> of the tuples on the page -- supposing we know the tuple size -- are
> the page size and, I think, MAXALIGN, and that doesn't sound too bad.
> Another possibility is to use pageinspect's heap_page_items() to
> determine the position within the page (lp_off), which seems like it
> might simplify things considerably. Then, we're entirely relying on
> the backend to tell us where the tuples are, and we only need to worry
> about the offsets relative to the start of the tuple.
>
> I kind of like that approach, because it doesn't involve having Perl
> code that knows how heap pages are laid out; we rely entirely on the C
> code for that. I'm not sure if it'd be a problem to have a TAP test
> for one contrib module that uses another contrib module, but maybe
> there's some way to figure that problem out.

Yeah, I'll give this a try.
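
For illustration only, the arithmetic left to the test under that approach might look like the sketch below, again in Python rather than perl. It assumes pageinspect is installed and that lp_off comes from heap_page_items(); the table name "test", the helper names, and the in-memory bytearray standing in for the relation file are all hypothetical.

```python
# Query the test could run through psql (requires CREATE EXTENSION pageinspect).
# The table name "test" is a placeholder.
LP_QUERY = (
    "SELECT lp, lp_off, lp_len, t_hoff "
    "FROM heap_page_items(get_raw_page('test', 0))"
)


def tuple_file_offset(blkno, lp_off, page_size=8192):
    """Byte offset of a tuple within the relation file, given the block
    number and the lp_off reported by the backend."""
    return blkno * page_size + lp_off


def patch_tuple(relfile, blkno, lp_off, offset_in_tuple, new_bytes,
                page_size=8192):
    """Overwrite bytes at a tuple-relative offset in an in-memory copy of
    the relation file and return the absolute position written."""
    start = tuple_file_offset(blkno, lp_off, page_size) + offset_in_tuple
    relfile[start:start + len(new_bytes)] = new_bytes
    return start


# Two-page stand-in for a relation file; the lp_off value is made up here,
# whereas a real test would take it from the query result.
relfile = bytearray(2 * 8192)
written_at = patch_tuple(relfile, blkno=1, lp_off=8160,
                         offset_in_tuple=24, new_bytes=b"\xff\xff")
```

The test then needs to know only the page size and the offsets relative to the start of the tuple; where the tuple sits on the page is whatever the backend says it is.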


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
