Re: amcheck (B-Tree integrity checking tool)

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: amcheck (B-Tree integrity checking tool)
Date: 2016-11-17 01:06:24
Message-ID: CAEepm=2-TMFWTURjYmVckpMvfCmd4pUSygDyQsyONChKsUzM2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 8, 2016 at 6:44 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Fri, Sep 2, 2016 at 11:19 AM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:
>> IMV the process is to post a patch to this list to certify that it
>> is yours to contribute and free of IP encumbrances that would
>> prevent us from using it. I will wait for that post.
>
> I attach my V3

+ * Ideally, we'd compare every item in the index against every other
+ * item in the index, and not trust opclass obedience of the transitive
+ * law to bridge the gap between children and their grandparents (as
+ * well as great-grandparents, and so on). We don't go to those
+ * lengths because that would be prohibitively expensive, and probably
+ * not markedly more effective in practice.
+ */

I skimmed the Modern B-Tree Techniques paper recently, and there was a
section on "the cousin problem" when verifying btrees, which this
comment reminded me of. I tried to come up with an example of a pair
of characters being switched in a collation that would introduce
undetectable corruption:

T1. Order = a < b < c < d

Btree = [a|c]
/ \
/ \
/ \
[a]-------[c]
| |
| |
[b]-------[d]

T2. Order = a < c < b < d

Now c and b have been switched around. Won't amcheck still pass since
we only verify immediate downlinks and siblings? Yet searches for b
will take a wrong turn at the root. Do I have this right? I wonder
if there is a way to verify that each page's high key < the 'next' key
for all ancestors.

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2016-11-17 01:10:26 Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Previous Message Andres Freund 2016-11-17 00:36:28 Re: Password identifiers, protocol aging and SCRAM protocol