Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence
Date: 2019-12-30 23:57:35
Message-ID: CAH2-Wzm5xYgCG0SYaQb-h-DYJVbcbEqzJZh8R0yEFMET5VKbYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 24, 2019 at 4:29 AM Anastasia Lubennikova
<a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> I updated the patchset.
> The first patch now contains only infrastructure changes
> and the second one sets opcisbitwise for btree opclasses in pg_opclass.dat.

We should try to formally define what we're trying to represent about
B-Tree opclasses here -- the definition of
"opcisbitwise"/preciseness/whatever should be tightened up. In
particular, it should be clear how the "image" binary row comparators
[1] (i.e. "operator *= equality" stuff) fit in. This new concept
should be defined in terms of that existing concept -- we're talking
about exactly the same variety of "internal binary equality" here, I
think.

I propose that we adopt the following definition: For an operator
class to be safe, its equality operator has to always agree with
datum_image_eq() (i.e. two datums must be bitwise equal after
detoasting).

(Maybe we should say something about "operator *= equality" as well
(or instead), since that is already documented in [1].)

We may also want to say something about foreign keys in this formal
definition of "opcisbitwise"/preciseness. Discussion around the bug
fixed by commit 1ffa59a85cb [1] showed that there was plenty of
confusion in this area. Commit 1ffa59a85cb simply solved the problem
that existed with foreign keys, without "joining the dots". It reused
the rowtypes.c "operator *= equality" stuff to fix the problem, but
only in an ad-hoc and undoumented way. Let's not do that again now.

Note: In theory this definition is stricter than truly necessary to
make deduplication safe, because we can imagine a contrived case in
which an operator class exists where datum_image_eq() does not always
agree with the equality operator, even though the equality operator
will reliably consider two datums to be equal only when they have
identical outputs from the underlying type's output function. This
could happen when an operator class author wasn't very careful about
zeroing padding -- this may not have mattered to the opclass author
because nobody relied on that padding anyway. I think that stuff like
this is not worth worrying about -- it can only happen because the
datatype/operator class author was very sloppy.

Note also: We seem to make this assumption already. Maybe this
uninitialized bytes side issue doesn't even need to be pointed out or
discussed. The comment just above VALGRIND_CHECK_MEM_IS_DEFINED()
within PageAddItemExtended() seems to suggest this. The comment
specifically mentions datumIsEqual() (not datum_image_eq()), but it's
exactly the same issue.

[1] https://www.postgresql.org/docs/devel/functions-comparisons.html#COMPOSITE-TYPE-COMPARISON
[2] https://www.postgresql.org/message-id/flat/3326fc2e-bc02-d4c5-e3e5-e54da466e89a%402ndquadrant.com

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2019-12-31 00:08:54 remove separate postgres.(sh)description files
Previous Message Andrew Dunstan 2019-12-30 23:55:26 Re: [PATCH] Increase the maximum value track_activity_query_size