Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence
Date: 2020-01-13 20:49:36
Message-ID: 771d014f-1b78-2770-4e97-c2413b889e77@postgrespro.ru
Lists: pgsql-hackers

On 31.12.2019 01:40, Peter Geoghegan wrote:
> On Mon, Dec 30, 2019 at 9:45 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> For example, float and numeric types are "never bitwise equal", while array,
>>> text, and other container types are "maybe bitwise equal". An array of
>>> integers or text with C collation can be treated as bitwise equal attributes,
>>> and it would be too harsh to restrict them from deduplication.
> We might as well support container types (like array) in the first
> Postgres version that has nbtree deduplication, I suppose. Even still,
> I don't think that it actually matters much to users. B-Tree indexes
> on arrays are probably very rare. Note that I don't consider text to
> be a container type here -- obviously btree/text_ops is a very
> important opclass for the deduplication feature. It may be the most
> important opclass overall.
>
> Recursively invoking a support function for the "contained" data type
> in the btree/array_ops support function seems like it might be messy.
> Not sure about that, though.
>
>>> What bothers me is that this option will unlikely be helpful on its own and
>>> we should also provide some kind of recheck function along with opclass, which
>>> complicates this idea even further and doesn't seem very clear.
>> It seems like the simplest thing might be to forget about the 'char'
>> column and just have a support function which can be used to assess
>> whether a given opclass's notion of equality is bitwise.
> I like the idea of relying only on a support function.

Attached is a WIP patch that adds a support function for btree
opclasses.
Before continuing, I want to make sure that I understood the discussion
above correctly.

The current version of the patch adds:

1) new syntax that allows specifying a support function:

CREATE OPERATOR CLASS int4_ops_test
FOR TYPE int4 USING btree AS
        OPERATOR 1 =(int4, int4),
        FUNCTION 1 btint4cmp(int4, int4),
        SUPPORT datum_image_eqisbitwise;

We can probably add more keywords to make the purpose of the support
function explicit.
Do you have any objections to where this new element is placed in the
CreateOpClass syntax structure?

2) a trivial support function, 'datum_image_eqisbitwise', that always
returns true.
It is named after 'datum_image_eq' because we define this support
function via its behavior.
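
To illustrate the property such a support function asserts, here is a
minimal standalone C sketch. It is not taken from the patch: the helper
names opclass_eq, image_eq and check_negative_zero are hypothetical, and
memcmp over the raw bytes only stands in for what 'datum_image_eq' does
for a fixed-size value. As I understand it, float is "never bitwise
equal" because 0.0 and -0.0 compare as equal while their
representations differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the opclass "=" operator on ordinary floats. */
bool
opclass_eq(double a, double b)
{
    return a == b;
}

/* Hypothetical stand-in for image equality: compare the raw bytes. */
bool
image_eq(double a, double b)
{
    return memcmp(&a, &b, sizeof(double)) == 0;
}

/* Self-check: the two notions of equality disagree for signed zeros. */
void
check_negative_zero(void)
{
    assert(opclass_eq(0.0, -0.0));   /* equal according to the operator */
    assert(!image_eq(0.0, -0.0));    /* but the sign bit differs */
    assert(image_eq(1.5, 1.5));      /* identical values share one image */
}
```

An opclass for which both helpers always agree (int4 in the example
above) is the case where the support function can return true.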

If this prototype looks fine, I will continue this work: add support
functions for the other opclasses and update pg_dump and the documentation.

Thoughts?

Attachment Content-Type Size
v5-WIP-Opclass-bitwise-equality-0001.patch text/x-patch 9.1 KB
