Re: new heapcheck contrib module

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2021-02-24 18:55:28
Message-ID: B36B01A3-1958-495B-BA05-167046B4A773@enterprisedb.com
Lists: pgsql-hackers

> On Feb 24, 2021, at 10:40 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Feb 23, 2021 at 12:38 PM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>> This is changed in v40 as you propose: the program exits on FATAL and PANIC level errors and on failure to send a query. On lesser errors (which include all corruption reports about btrees and some heap corruption related errors), the slot's connection is still usable, I think. Are there cases where the error is lower than FATAL and yet the connection needs to be reestablished? It does not seem so from the testing I have done, but perhaps I'm not thinking of the right sort of non-fatal error?
>
> I think you should assume that if you get an ERROR you can - and
> should - continue to use the connection, but still exit non-zero at
> the end. Perhaps one can contrive some scenario where that's not the
> case, but if the server does the equivalent of "ERROR: session
> permanently borked" we should really change those to FATAL; I think
> you can discount that possibility.

Ok, that's how I had it, so no changes necessary.

>> In v40, exit(1) means the program encountered fatal errors leading it to stop, and exit(2) means that a non-fatal error and/or corruption reports occurred somewhere during the processing. Otherwise, exit(0) means your database was successfully checked and is healthy.
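
To make the exit status policy above concrete, here is a minimal sketch in Python. It is not taken from the patch (pg_amcheck itself is written in C); the outcome labels and the function name `exit_status` are hypothetical and exist only for illustration. It assumes each checked relation yields exactly one outcome.

```python
# Hypothetical outcome labels, not identifiers from the v40 patch.
STOP_IMMEDIATELY = {"fatal", "panic", "send_failed"}  # connection is unusable
KEEP_GOING = {"error", "corruption"}                  # connection stays usable


def exit_status(outcomes):
    """Map a sequence of per-relation outcomes to a process exit status.

    1: a FATAL/PANIC error or a failure to send a query stopped the run.
    2: the run finished, but a non-fatal error or corruption was reported.
    0: every relation was checked and nothing was reported.
    """
    saw_problem = False
    for outcome in outcomes:
        if outcome in STOP_IMMEDIATELY:
            # Stop at once; later relations are never checked.
            return 1
        if outcome in KEEP_GOING:
            # Remember the problem, but keep using the same connection.
            saw_problem = True
        elif outcome != "ok":
            raise ValueError(f"unknown outcome: {outcome!r}")
    return 2 if saw_problem else 0
```

With these labels, `exit_status(["ok", "corruption", "ok"])` yields 2, while any fatal outcome yields 1 regardless of what was seen before it.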

Other changes in v40 per our off-list discussions but not related to your on-list review comments:

Removed option --no-tables.

Removed option --no-dependents. This was a synonym for the combination of --exclude-toast and --exclude-indexes, but having such a synonym isn't all that helpful.

Renamed --exclude-toast to --no-toast-expansion and changed its behavior a bit. Likewise, renamed --exclude-indexes to --no-index-expansion and changed its behavior. The behavioral change is that these options now only suppress the automatic expansion of the list of relations to check to include the toast tables or indexes associated with relations already in the list. The prior names didn't exclusively mean that, and the prior behavior didn't exclusively do that.
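
The expansion semantics described above can be sketched roughly as follows. This is an illustration in Python, not the patch's C code; `expand_relations`, `toast_of`, and `indexes_of` are hypothetical names, and the two dictionaries stand in for catalog lookups. The point is that the options only suppress automatic additions; a toast table or index that was selected explicitly stays in the list.

```python
def expand_relations(selected, toast_of, indexes_of,
                     no_toast_expansion=False, no_index_expansion=False):
    """Return the relations to check, in order, without duplicates.

    selected:   relation names chosen explicitly by the user
    toast_of:   hypothetical map from a table to its toast table
    indexes_of: hypothetical map from a table to a list of its indexes
    """
    result = list(selected)
    seen = set(selected)

    def add(rel):
        # Append a relation only the first time it is encountered.
        if rel not in seen:
            seen.add(rel)
            result.append(rel)

    for rel in selected:
        if not no_toast_expansion and rel in toast_of:
            add(toast_of[rel])
        if not no_index_expansion:
            for idx in indexes_of.get(rel, []):
                add(idx)
    return result
```

For example, with `toast_of = {"t1": "pg_toast_1"}` and `indexes_of = {"t1": ["t1_pkey"]}`, selecting `["t1", "t1_pkey"]` with both options set still checks `t1_pkey`, because it was named explicitly rather than added by expansion.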

Updated the docs per your other review email.

Implemented --progress to behave much more like how it does in pg_basebackup.

Attachment                                                  Content-Type              Size
v40-0001-Reworking-ParallelSlots-for-mutliple-DB-use.patch  application/octet-stream  24.7 KB
v40-0002-Adding-contrib-module-pg_amcheck.patch             application/octet-stream  130.5 KB
v40-0003-Extending-PostgresNode-to-test-corruption.patch    application/octet-stream  16.7 KB
