Re: WIP: BRIN multi-range indexes

From: Sascha Kuhl <yogidabanli(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-07-11 13:37:43
Message-ID: CAPvVvKDiVEQNznNDOmkJSO4N6D4TtFJ7X5ud92zMVhC=NdOa3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry, my topic is different

Sascha Kuhl <yogidabanli(at)gmail(dot)com> schrieb am Sa., 11. Juli 2020, 15:32:

>
>
>
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> schrieb am Sa., 11. Juli
> 2020, 13:24:
>
>> On Fri, Jul 10, 2020 at 04:44:41PM +0200, Sascha Kuhl wrote:
>> >Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> schrieb am Fr., 10. Juli
>> 2020,
>> >14:09:
>> >
>> >> On Fri, Jul 10, 2020 at 06:01:58PM +0900, Masahiko Sawada wrote:
>> >> >On Fri, 3 Jul 2020 at 09:58, Tomas Vondra <
>> tomas(dot)vondra(at)2ndquadrant(dot)com>
>> >> wrote:
>> >> >>
>> >> >> On Sun, Apr 05, 2020 at 08:01:50PM +0300, Alexander Korotkov wrote:
>> >> >> >On Sun, Apr 5, 2020 at 8:00 PM Tomas Vondra
>> >> >> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> >> >> ...
>> >> >> >> >
>> >> >> >> >Assuming we're not going to get 0001-0003 into v13, I'm not so
>> >> >> >> >inclined to rush on these three as well. But you're willing to
>> >> commit
>> >> >> >> >them, you can count round of review on me.
>> >> >> >> >
>> >> >> >>
>> >> >> >> I have no intention to get 0001-0003 committed. I think those
>> changes
>> >> >> >> are beneficial on their own, but the primary reason was to
>> support
>> >> the
>> >> >> >> new opclasses (which require those changes). And those parts are
>> not
>> >> >> >> going to make it into v13 ...
>> >> >> >
>> >> >> >OK, no problem.
>> >> >> >Let's do this for v14.
>> >> >> >
>> >> >>
>> >> >> Hi Alexander,
>> >> >>
>> >> >> Are you still interested in reviewing those patches? I'll take a
>> look at
>> >> >> 0001-0003 to check that your previous feedback was addressed. Do you
>> >> >> have any comments about 0004 / 0005, which I think are the more
>> >> >> interesting parts of this series?
>> >> >>
>> >> >>
>> >> >> Attached is a rebased version - I realized I forgot to include 0005
>> in
>> >> >> the last update, for some reason.
>> >> >>
>> >> >
>> >> >I've done a quick test with this patch set. I wonder if we can improve
>> >> >brin_page_items() SQL function in pageinspect as well. Currently,
>> >> >brin_page_items() is hard-coded to support only normal brin indexes.
>> >> >When we pass brin-bloom or brin-multi-range to that function the
>> >> >binary values are shown in 'value' column but it seems not helpful for
>> >> >users. For instance, here is an output of brin_page_items() with a
>> >> >brin-multi-range index:
>> >> >
>> >> >postgres(1:12801)=# select * from brin_page_items(get_raw_page('mul',
>> >> >2), 'mul');
>> >> >-[ RECORD 1
>> >>
>> ]----------------------------------------------------------------------------------------------------------------------
>> >>
>> >>
>> >-----------------------------------------------------------------------------------------------------------------------------------
>> >> >----------------------------
>> >> >itemoffset | 1
>> >> >blknum | 0
>> >> >attnum | 1
>> >> >allnulls | f
>> >> >hasnulls | f
>> >> >placeholder | f
>> >> >value |
>> >>
>> {\x010000001b0000002000000001000000e5700000e6700000e7700000e8700000e9700000ea700000eb700000ec700000ed700000ee700000ef
>> >>
>> >>
>> >700000f0700000f1700000f2700000f3700000f4700000f5700000f6700000f7700000f8700000f9700000fa700000fb700000fc700000fd700000fe700000ff700
>> >> >00000710000}
>> >> >
>> >>
>> >> Hmm. I'm not sure we can do much better, without making the function
>> >> much more complicated. I mean, even with regular BRIN indexes we don't
>> >> really know if the value is plain min/max, right?
>> >>
>> >You can be sure with the next node. The value is in can be false positiv.
>> >The value is out is clear. You can detect the change between in and out.
>> >
>>
>> I'm sorry, I don't understand what you're suggesting. How is any of this
>> related to false positive rate, etc?
>>
>
> Hi,
>
> You check by the bloom filter if a value you're searching is part of the
> node, right?
>
> In case, the value is in the bloom filter you could be mistaken, because
> another value could have the same hash profile, no?
>
> However if the value is out, the filter can not react. You can be sure
> that the value is out.
>
> If you looking for a range or many ranges of values, you traverse many
> nodes. By knowing the value is out, you can state a clear set of nodes that
> form the range. However the border is somehow unsharp because of the false
> positives.
>
> I am not sure if we write about the same. Please confirm, this can be
> needed. Please.
>
> I will try to understand what you write. Interesting
>
> Sascha
>
>>
>> The problem here is that while plain BRIN opclasses have fairly simple
>> summary that can be stored using a fixed number of simple data types
>> (e.g. minmax will store two values with the same data types as the
>> indexd column)
>>
>> result = palloc0(MAXALIGN(SizeofBrinOpcInfo(2)) +
>> sizeof(MinmaxOpaque));
>> result->oi_nstored = 2;
>> result->oi_opaque = (MinmaxOpaque *)
>> MAXALIGN((char *) result + SizeofBrinOpcInfo(2));
>> result->oi_typcache[0] = result->oi_typcache[1] =
>> lookup_type_cache(typoid, 0);
>>
>> The opclassed introduced here have somewhat more complex summary, stored
>> as a single bytea value - which is what gets printed by brin_page_items.
>>
>> To print something easier to read (for humans) we'd either have to teach
>> brin_page_items about the diffrent opclasses (multi-range, bloom) end
>> how to parse the summary bytea, or we'd have to extend the opclasses
>> with a function formatting the summary. Or rework how the summary is
>> stored, but that seems like the worst option.
>>
>>
>> regards
>>
>> --
>> Tomas Vondra http://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-07-11 13:58:08 Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Previous Message Sascha Kuhl 2020-07-11 13:32:43 Re: WIP: BRIN multi-range indexes