Re: new heapcheck contrib module

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2020-08-03 03:17:25
Message-ID: 53B5B281-81BF-40FD-9903-EBAA5193506B@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jul 30, 2020, at 10:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> + curchunk = DatumGetInt32(fastgetattr(toasttup, 2,
> + ctx->toast_rel->rd_att, &isnull));
>
> Should we be worrying about the possibility of fastgetattr crapping
> out if the TOAST tuple is corrupted?

I think we should, but I'm not sure we should be worrying about it at this location. If the toast index is corrupt, systable_getnext_ordered could trip over the index corruption in the process of retrieving the toast tuple, so checking the toast tuple only helps if the toast index does not cause a crash first. I think the toast index should be checked before this point, ala verify_nbtree, so that we don't need to worry about that here. It might also make more sense to verify the toast table ala verify_heapam prior to here, so we don't have to worry about that here either. But that raises questions about whose responsibility this all is. If verify_heapam checks the toast table and toast index before the main table, that takes care of it, but makes a mess of the idea of verify_heapam taking a start and end block, since verifying the toast index is an all or nothing proposition, not something to be done in incremental pieces. If we leave verify_heapam as it is, then it is up to the caller to check the toast before the main relation, which is more flexible, but is more complicated and requires the user to remember to do it. We could split the difference by having verify_heapam do nothing about toast, leaving it up to the caller, but make pg_amcheck handle it by default, making it easier for users to not think about the issue. Users who want to do incremental checking could still keep track of the chunks that have already been checked, not just for the main relation, but for the toast relation, too, and give start and end block arguments to verify_heapam for the toast table check and then again for the main table check. That doesn't fix the question of incrementally checking the index, though.

Looking at it a slightly different way, I think what is being checked at the point in the code you mention is the logical structure of the toasted value related to the current main table tuple, not the lower level tuple structure of the toast table. We already have a function for checking a heap, namely verify_heapam, and we (or the caller, really) should be using that. The clean way to do things is

verify_heapam(toast_rel)
verify_btreeam(toast_idx)
verify_heapam(main_rel)

and then depending on how fast and loose you want to be, you can use the start and end block arguments, which are inherently a bit half-baked, given the lack of any way to be sure you check precisely the right range of blocks, and also you can be fast and loose about skipping the index check or not, as you see fit.

Thoughts?


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2020-08-03 03:43:53 Re: Document "59.2. Built-in Operator Classes" have a clerical error?
Previous Message osdba 2020-08-03 03:17:01 Document "59.2. Built-in Operator Classes" have a clerical error?