Re: WIP: BRIN multi-range indexes

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-08-04 14:36:51
Message-ID: CAPpHfdt0YW-+-HzWPmwxCrQBsuc1J+VOOdzg0akuP_XYoxk_cA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Tomas!

Sorry for the late reply.

On Sun, Jul 19, 2020 at 6:19 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> I think there's a number of weak points in this approach.
>
> Firstly, it assumes the summaries can be represented as arrays of
> built-in types, which I'm not really sure about. It clearly is not true
> for the bloom opclasses, for example. But even for minmax oclasses it's
> going to be tricky because the ranges may be on different data types so
> presumably we'd need somewhat nested data structure.
>
> Moreover, multi-minmax summary contains either points or intervals,
> which requires additional fields/flags to indicate that. That further
> complicates the things ...
>
> maybe we could decompose that into separate arrays or something, but
> honestly it seems somewhat premature - there are far more important
> aspects to discuss, I think (e.g. how the ranges are built/merged in
> multi-minmax, or whether bloom opclasses are useful at all).

I see. But there is at least a second option to introduce a new
datatype with just an output function. In the similar way
gist/tsvector_ops uses gtsvector key type. I think it would be more
transparent than using just bytea. Also, this is the way we already
use in the core.

> >BTW, I've applied the patchset to the current master, but I got a lot
> >of duplicate oids. Could you please resolve these conflicts. I think
> >it would be good to use high oid numbers to evade conflicts during
> >development/review, and rely on committer to set final oids (as
> >discussed in [1]).
> >
> >Links
> >1. https://www.postgresql.org/message-id/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-%3DeaochT0dd2BN9CQ%40mail.gmail.com
>
> Did you use the patchset from 2020/07/03? I don't get any duplicate OIDs
> with it, and it's already using quite high OIDs (part 4 uses >= 8000,
> part 5 uses >= 9000).

Yep, it appears that I was using the wrong version of patchset.
Patchset from 2020/07/03 works good on the current master.

------
Regards,
Alexander Korotkov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-08-04 14:59:31 Re: new heapcheck contrib module
Previous Message Alexander Korotkov 2020-08-04 14:27:09 Re: Concurrency bug in amcheck