Re: [HACKERS] A design for amcheck heapam verification

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] A design for amcheck heapam verification
Date: 2018-01-11 10:14:06
Lists: pgsql-hackers


I like the heapam verification functionality and am already using it. I plan to review this patch, probably this week.

From my use so far, I have some thoughts on the interface. Here is what I get:

# select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f',true);
ERROR: XX001: heap tuple (45,21) from table "messagefiltervalue" lacks matching index tuple within index "messagefiltervalue_group_id_59490523e6ee451f"
HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error.
LOCATION: bt_tuple_present_callback, verify_nbtree.c:1316
Time: 45.668 ms

# select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f');

(1 row)
Time: 32.873 ms

# select bt_index_parent_check('messagefiltervalue_group_id_59490523e6ee451f');
ERROR: XX002: down-link lower bound invariant violated for index "messagefiltervalue_group_id_59490523e6ee451f"
DETAIL: Parent block=6259 child index tid=(1747,2) parent page lsn=4A0/728F5DA8.
LOCATION: bt_downlink_check, verify_nbtree.c:1188
Time: 391194.113 ms

The new check appears to run about four orders of magnitude faster than bt_index_parent_check() (roughly 46 ms versus 391 s), and it still finds corruption in my index that bt_index_check() without the second argument missed.
From this output I can see that there is corruption, but I cannot tell:
1. What the scale of the corruption is
2. Whether these corruptions are related

I think an interface that lists all errors, or the top N errors, could be useful.
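
To illustrate the kind of interface I mean, here is a minimal sketch in Python. It is not amcheck code and not a proposed API: the names collect_violations and Violation are hypothetical, and an exact in-memory set of TIDs stands in for the real index lookup purely for illustration. The point is only that the checker accumulates findings up to a limit instead of stopping at the first ERROR.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Tid = Tuple[int, int]  # (block, offset), as printed in the ERROR message


@dataclass
class Violation:
    """One finding; hypothetical structure, not part of amcheck."""
    tid: Tid
    message: str


def collect_violations(heap_tids: Iterable[Tid],
                       index_tids: Iterable[Tid],
                       limit: Optional[int] = None) -> List[Violation]:
    """Report every heap TID that has no matching index entry.

    Stops early once `limit` findings were collected (None means no limit).
    """
    indexed = set(index_tids)  # assumption: exact set instead of a real index probe
    found: List[Violation] = []
    for tid in heap_tids:
        if tid not in indexed:
            found.append(Violation(tid, "lacks matching index tuple"))
            if limit is not None and len(found) >= limit:
                break
    return found


# Toy data: two of the four heap tuples are missing from the index.
heap = [(45, 20), (45, 21), (45, 22), (46, 1)]
index = [(45, 20), (46, 1)]

report = collect_violations(heap, index, limit=10)
for v in report:
    print(f"heap tuple {v.tid} {v.message}")
```

With a list like this, both questions above become answerable: the length of the report gives the scale, and neighbouring TIDs hint at whether the findings are related.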

> On 14 Dec 2017, at 0:02, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> This could also test the reproducibility of the tests with a fixed
>> seed number and at least two rounds, a low number of elements could be
>> more appropriate to limit the run time.
> The runtime is already dominated by pg_regress overhead. As it says in
> the README, using a fixed seed in the test harness is pointless,
> because it won't behave in a fixed way across platforms. As long as we
> cannot ensure deterministic behavior, we may as well fully embrace
> non-determinism.
I think determinism across platforms is not as important as determinism across runs.
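
What I mean by determinism across runs can be sketched as follows. This is a Python illustration only, not the test harness from the patch; generate_elements and the seed value are hypothetical. On one machine, two rounds with the same seed produce the same elements, so a failure seen once can be replayed there even if another platform would generate different data.

```python
import random
from typing import List


def generate_elements(seed: int, count: int) -> List[int]:
    """Return `count` pseudo-random 64-bit values from a private, seeded generator."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return [rng.getrandbits(64) for _ in range(count)]


# Two rounds with the same seed yield identical test data.
first_round = generate_elements(seed=42, count=1000)
second_round = generate_elements(seed=42, count=1000)
print(first_round == second_round)  # prints True
```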

Thanks for the amcheck! It is very useful.

Best regards, Andrey Borodin.
