Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-08-04 15:17:43
Message-ID: 20200804151743.dbobvzovtnpgfpn7@development
Lists: pgsql-hackers

On Tue, Aug 04, 2020 at 05:36:51PM +0300, Alexander Korotkov wrote:
>Hi, Tomas!
>
>Sorry for the late reply.
>
>On Sun, Jul 19, 2020 at 6:19 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> I think there's a number of weak points in this approach.
>>
>> Firstly, it assumes the summaries can be represented as arrays of
>> built-in types, which I'm not really sure about. It clearly is not true
>> for the bloom opclasses, for example. But even for minmax opclasses it's
>> going to be tricky because the ranges may be on different data types so
>> presumably we'd need a somewhat nested data structure.
>>
>> Moreover, a multi-minmax summary contains either points or intervals,
>> which requires additional fields/flags to indicate that. That further
>> complicates things ...
>>
>> Maybe we could decompose that into separate arrays or something, but
>> honestly it seems somewhat premature - there are far more important
>> aspects to discuss, I think (e.g. how the ranges are built/merged in
>> multi-minmax, or whether bloom opclasses are useful at all).
>
>I see. But there is at least a second option: introduce a new
>datatype with just an output function, similar to how
>gist/tsvector_ops uses the gtsvector key type. I think it would be more
>transparent than using just bytea. Also, this is an approach we already
>use in core.
>

So you're proposing to have new data types "brin_minmax_multi_summary"
and "brin_bloom_summary" (or some other names), with output functions
printing something nicer? I suppose that could work, and we could even
add pageinspect functions returning the value as raw bytea.

Good idea!
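
To make the idea a bit more concrete, here is a rough sketch (in Python,
purely for illustration) of what an output function for a multi-minmax
summary might print. The entry layout, the names SummaryEntry and
summary_out, and the text format are all assumptions made up for this
example; none of it reflects what the patch actually stores or prints.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SummaryEntry:
    """One element of a hypothetical multi-minmax summary."""
    lo: int
    hi: Optional[int] = None  # None marks a single point, not an interval


def summary_out(entries: List[SummaryEntry]) -> str:
    """Render the summary as readable text (the format is invented here)."""
    parts = []
    for e in entries:
        if e.hi is None:
            # A point is printed as a bare value.
            parts.append(str(e.lo))
        else:
            # An interval is printed with both boundaries.
            parts.append(f"[{e.lo} .. {e.hi}]")
    return "{" + ", ".join(parts) + "}"


# Two intervals and one point, mixed in a single summary.
summary = [SummaryEntry(1, 5), SummaryEntry(7), SummaryEntry(10, 20)]
print(summary_out(summary))
```

The point/interval distinction mentioned earlier in the thread shows up
here as the optional upper bound. A real output function would also have
to deal with the underlying data type of the indexed column.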

>> >BTW, I've applied the patchset to the current master, but I got a lot
>> >of duplicate oids. Could you please resolve these conflicts. I think
>> >it would be good to use high oid numbers to evade conflicts during
>> >development/review, and rely on committer to set final oids (as
>> >discussed in [1]).
>> >
>> >Links
>> >1. https://www.postgresql.org/message-id/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-%3DeaochT0dd2BN9CQ%40mail.gmail.com
>>
>> Did you use the patchset from 2020/07/03? I don't get any duplicate OIDs
>> with it, and it's already using quite high OIDs (part 4 uses >= 8000,
>> part 5 uses >= 9000).
>
>Yep, it appears that I was using the wrong version of the patchset.
>The patchset from 2020/07/03 works fine on the current master.
>

OK, good.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
