Re: pg_amcheck contrib application

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_amcheck contrib application
Date: 2021-03-15 18:11:17
Message-ID: C2C142F3-F740-4197-B507-20B28B7395AC@enterprisedb.com
Lists: pgsql-hackers

> On Mar 15, 2021, at 10:04 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Looks like we're not quite out of the woods, as hornet and tern are
> still unhappy:
>
> # Failed test 'pg_amcheck excluding all corrupt schemas status (got 2 vs expected 0)'
> # at t/003_check.pl line 498.
>
> # Failed test 'pg_amcheck excluding all corrupt schemas stdout /(?^:^$)/'
> # at t/003_check.pl line 498.
> # 'heap table "db1"."pg_catalog"."pg_statistic", block 2, offset 1, attribute 27:
> # final toast chunk number 0 differs from expected value 1
> # heap table "db1"."pg_catalog"."pg_statistic", block 2, offset 1, attribute 27:
> # toasted value for attribute 27 missing from toast table
> # '
> # doesn't match '(?^:^$)'
> # Looks like you failed 2 tests of 60.
> [12:18:06] t/003_check.pl ...........
> Dubious, test returned 2 (wstat 512, 0x200)
> Failed 2/60 subtests
>
> These animals have somewhat weird alignment properties: MAXALIGN is 8
> but ALIGNOF_DOUBLE is only 4. I speculate that that is affecting their
> choices about whether an out-of-line TOAST value is needed, breaking
> this test case.

The pg_amcheck test case is not corrupting any pg_catalog tables, but contrib/amcheck/verify_heapam is complaining about a corruption in pg_catalog.pg_statistic.

The logic in verify_heapam only looks for a value in the toast table if the tuple it gets from the main table (in this case, from pg_statistic) has an attribute that claims to be toasted. The error message you quoted above simply means that no entry exists in the toast table for that attribute. The bit about "final toast chunk number 0 differs from expected value 1" is super unhelpful, as what it is really saying is that no chunks were found. I should submit a patch to not print that message in cases where the attribute is missing from the toast table.
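
To make the intended reporting behavior concrete, here is a rough sketch in Python. It is not the C code in verify_heapam: the function name, its parameters, and the idea of comparing a count of chunks seen against an expected count are assumptions made for illustration. Only the two message texts are taken from the output quoted above.

```python
def toast_check_messages(attnum, claims_toasted, chunks_seen, expected_chunks):
    """Return the corruption messages for one attribute of one tuple.

    Hypothetical model, not the verify_heapam implementation:
      claims_toasted  -- whether the main-table tuple says the value is toasted
      chunks_seen     -- how many chunks were actually found in the toast table
      expected_chunks -- how many chunks the toast pointer implies
    """
    messages = []

    # The toast table is only consulted when the attribute claims to be toasted.
    if not claims_toasted:
        return messages

    # No chunks at all: report the missing value and skip the chunk-number
    # message, which would only restate the same problem less clearly.
    if chunks_seen == 0:
        messages.append(
            f"toasted value for attribute {attnum} missing from toast table"
        )
        return messages

    # Some chunks were found, so a mismatch in the count is worth reporting.
    if chunks_seen != expected_chunks:
        messages.append(
            f"final toast chunk number {chunks_seen} "
            f"differs from expected value {expected_chunks}"
        )
    return messages
```

With this shape, the pg_statistic case above (attribute 27, zero chunks found, one expected) would produce only the "missing from toast table" line.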

Is it possible that pg_statistic really is corrupt here, and that this is not a bug in pg_amcheck? It's not like we've been checking for corruption in the build farm up till now. I notice that this test, as well as test 005_opclass_damage.pl, neglects to turn off autovacuum for the test node. So maybe the corruptions are getting propagated during autovacuum? This is just a guess, but I will shortly submit a patch that turns off autovacuum for the test node.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
