Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-deterministic IndexTuple toast compression from index_form_tuple() + amcheck false positives
Date: 2019-01-14 21:46:32
Message-ID: CAH2-WznfZmAmmEW+55YuMmc0AamL+1OOJB3iYZ3Yu3Wej4z9qQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 14, 2019 at 1:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> > The heapallindexed enhancement that made it into Postgres 11 assumes
> > that the representation of index tuples produced by index_form_tuple()
> > (or all relevant index_form_tuple() callers) is deterministic: for
> > every possible heap tuple input there must be a single possible
> > (bitwise) output.
>
> That assumption seems unbelievably fragile. How badly do things
> break when it's violated?

Well, they break. You get a false positive report of corruption, since
there isn't a bitwise identical version of the datum from the heap in
the index for that same tuple. This seems to be very unlikely in
practice, but amcheck is concerned with unlikely outcomes.
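
A toy sketch of that failure mode, in Python rather than C. It is not
amcheck's code: form_index_datum() is a hypothetical stand-in for
index_form_tuple(), a plain set stands in for the fingerprint structure,
and the zlib level is only an assumed way to model "same input, different
compression state".

```python
import zlib


def form_index_datum(value: bytes, level: int) -> bytes:
    """Hypothetical stand-in for index_form_tuple().

    The zlib level models differing compression state for the same
    logical value; level 0 stores the data without compressing it.
    """
    return zlib.compress(value, level)


heap_value = b"abc" * 1000

# Fingerprint built from the representation that went into the index.
index_fingerprint = {form_index_datum(heap_value, 9)}

# Verification re-forms the datum from the heap and probes by raw bytes.
probe_same_state = form_index_datum(heap_value, 9)
probe_other_state = form_index_datum(heap_value, 0)

same_state_found = probe_same_state in index_fingerprint
other_state_found = probe_other_state in index_fingerprint

# Both representations still decode to the same value.
same_value = zlib.decompress(probe_other_state) == heap_value

print(same_state_found, other_state_found, same_value)
```

The second probe misses even though nothing is corrupt, which is the shape
of the false positive described above.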

> Also, is the assumption just that a fixed source tuple will generate
> identical index entries across repeated index_form_tuple attempts?

I would have said that the assumption is that a fixed source tuple
will generate identical index entries. The problem with that is that
my idea of what constitutes a fixed input now seems to have been
faulty. I didn't think that the executor could mutate TOAST state in a
way that made this kind of inconsistency possible.

> Or is it assuming that logically equal index entries will be bitwise
> equal? The latter is broken on its face, because index_form_tuple()
> doesn't try to hide differences in the toasting state of source
> datums.

Logical equality as I understand the term doesn't enter into it at all
-- B-Tree operator class semantics are not involved here. I'm not sure
if that's what you meant, but I want to be clear on that. amcheck
certainly knows that it cannot assume that scankey logical equality is
the same thing as bitwise equality.
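
As an analogy only, the gap between comparison-level equality and bitwise
equality can be shown with Python's Decimal type. This is an assumed
stand-in for any datatype whose comparison operator ignores a
representational difference; it says nothing about how a particular
PostgreSQL type is stored.

```python
from decimal import Decimal

a = Decimal("1.0")
b = Decimal("1.00")

# Equal according to the type's comparison semantics.
logically_equal = a == b

# Not equal when the two representations are compared byte for byte.
bitwise_equal = str(a).encode() == str(b).encode()

print(logically_equal, bitwise_equal)
```

A check that works on raw bytes has to treat these as different values,
which is why it cannot lean on operator class equality.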

--
Peter Geoghegan
