Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-09-11 18:05:10
Message-ID: 20200911180510.hjqxdpgfpy3damqn@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 11, 2020 at 10:08:15AM -0400, John Naylor wrote:
>On Fri, Sep 11, 2020 at 6:14 AM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>> I understand. I just feel a bit uneasy about replacing an index with
>> something that may or may not be better for a certain use case. I mean,
>> if you have data set for which regular minmax works fine, wouldn't you
>> be annoyed if we just switched it for something slower?
>
>How about making multi minmax the default for new indexes, and those
>who know their data will stay very well correlated can specify simple
>minmax ops for speed? Upgraded indexes would stay the same, and only
>new ones would have the risk of slowdown if not attended to.
>

That might work, I think. I like that it's an explicit choice, i.e. we
may change what the default opclass is, but the behavior won't change
unexpectedly during REINDEX etc. It might still be a bit surprising
after dump/restore, but that's probably fine.

It would be ideal if the opclasses were binary compatible, allowing a
more seamless transition. Unfortunately that seems impossible, because
plain minmax uses two Datums to store the range, while multi-minmax uses
a more complex structure.

>Also, I wonder if the slowdown in building a new index is similar to
>the slowdown for updates. I'd like to run some TCP-H tests (that will
>take some time).
>

It might be, because it needs to deserialize/serialize the summary too,
and there's no option to amortize the costs over many inserts. OTOH the
insert probably needs to do various other things, so maybe it's won't be
that bad. But yeah, testing and benchmarking it would be nice. Do you
plan to test just the minmax-multi opclass, or will you look at the
bloom one too?

Attached is a slightly improved version - I've merged the various pieces
into the "main" patches, and made some minor additional optimizations.
I've left the cost tweak as a separate part for now, though.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Pass-all-keys-to-BRIN-consistent-function--20200911b.patch text/plain 20.8 KB
0002-Move-IS-NOT-NULL-checks-to-bringetbitmap-20200911b.patch text/plain 9.9 KB
0003-Move-processing-of-NULLs-from-BRIN-support-20200911b.patch text/plain 16.1 KB
0004-BRIN-bloom-indexes-20200911b.patch text/plain 137.4 KB
0005-BRIN-minmax-multi-indexes-20200911b.patch text/plain 217.4 KB
0006-tweak-costing-for-bloom-minmax-multi-index-20200911b.patch text/plain 3.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-09-11 18:09:30 Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)
Previous Message Peter Geoghegan 2020-09-11 17:41:01 Re: [patch] _bt_binsrch* improvements - equal-prefix-skip binary search