Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2021-03-22 05:27:32
Message-ID: ea369ea5-0d61-bbee-c1c7-bfe17728bab1@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/17/21 7:59 PM, John Naylor wrote:
> On Thu, Mar 11, 2021 at 12:26 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com <mailto:tomas(dot)vondra(at)enterprisedb(dot)com>>
> wrote:
>>
>> Hi,
>>
>> Here is an updated version of the patch series.
>>
>> It fixes the assert failure (just remove the multiplication from it) and
>> adds a simple regression test that exercises this.
>>
>> Based on the discussion so far, I've decided to keep just the new
>> signature of the consistent function. That's a bit simpler than having
>> to support both 3 and 4 parameters, and it would not deal with the NULL
>> changes anyway (mostly harmless code duplication, but still). I've also
>> realized the API documentation in SGML needs updating.
>>
>> At this point, I think 0001-0006 parts are mostly committable.
>
> I think so. I've marked it RFC for this six.
>

I was just about to commit the 0001-0006 over the weekend. I went over
the patches to polish them, most of the changes were pretty simple:

- minor cleanup / rewording of comments

- resolving two remaining FIXMEs in the minmax-multi patch

- removing/rewording a bunch of XXX comments (most of them are about
possible future improvements)

- assigned proper OIDs and bumped catversion in patches touching the
catalogs

- bugfix: 0005 and 0006 were still adding fields into BrinOptions and
BrinDesc, but that was used before we got opclass parameters

- bugfix: two or three corrections in catalog contents

- doc fix: the brin.sgml bloom was referring to bit_bloom_ops, but
there's no such thing

- I have considered to get rid of 0004 (essentially merging it into 0002
and 0003 patches) but I decided not to do that, as it'd make the changes
in those two patches less obvious.

But then I remembered that long time ago there were suggestions to not
include the new opclasses directly, but through contrib. I'm not sure if
we want to do that, but I decided to give it a try - attached are the
polished patches 0001-0006, along with two patches that move the bloom
and minmax-multi opclasses to contrib.

In general, it seems much easier to define the opclasses in extension as
opposed to doing it directly in src/include/catalog. It however probably
needs a bit more work - in particular, the extensions currently create
just operator classes, not operator families. Not sure what consequences
could that have, though.

All the regression tests work fine, with the exception of minmax-multi
on a CIDR column. I don't know why, but the CREATE INDEX then fails like
this:

ERROR: missing operator 1(650,650) in opfamily 16463

650 is cidr, so this essentially says there's no (cidr<cidr) operator.
With the opclasses defined in .dat files this however worked, so I
suppose it's related to the missing operator families.

There's one remaining problem, though - the opclasses are using custom
data types for the range summaries. Essentially a varlena data type,
with internal structure. When defined directly in core, we knew the OID
of that type, which allowed us to store it into BrinOpcInfo. That's
mostly internal information, but pageinspect used it to print the
summary in brin_page_items (by just using the type out function).
Obviously, without the OID it prints just bytea blob, which is not very
useful :-(

I wonder if we could lookup the type, somehow. I mean, we know the name
of the types (brin_bloom_summary/brin_minmax_multi_summary), so perhaps
we could look it up in TYPENAMENSP? Not sure where to get the namespace,
though, and maybe there are (security) issues with this? Or maybe we
could pass the type OID to pageinspect directly (we're already passing
OID of the index, so we already have to trust the value).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Move-bsearch_arg-to-src-port-20210322.patch text/x-patch 7.4 KB
0002-Pass-all-scan-keys-to-BRIN-consistent-funct-20210322.patch text/x-patch 24.1 KB
0003-Move-IS-NOT-NULL-handling-from-BRIN-support-20210322.patch text/x-patch 23.1 KB
0004-Optimize-allocations-in-bringetbitmap-20210322.patch text/x-patch 4.9 KB
0005-BRIN-bloom-indexes-20210322.patch text/x-patch 128.2 KB
0006-BRIN-minmax-multi-indexes-20210322.patch text/x-patch 242.3 KB
0007-move-bloom-to-contrib-20210322.patch text/x-patch 79.1 KB
0008-move-minmax-multi-to-contrib-20210322.patch text/x-patch 119.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mahendra Singh Thalor 2021-03-22 05:28:17 Re: Can we remove extra memset in BloomInitPage, GinInitPage and SpGistInitPage when we have it in PageInit?
Previous Message Fujii Masao 2021-03-22 05:07:43 Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.