Re: WIP: BRIN multi-range indexes

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2021-02-09 14:46:41
Message-ID: CAFBsxsF_nTeG8EX9gUkkpiAHOpdU_OEKbvsK3BzeMSdJbERbjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 3, 2021 at 7:54 PM Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
wrote:
>
> [v-20210203]

Hi Tomas,

I have some random comments from reading the patch, but haven't gone into
detail in the newer aspects. I'll do so in the near future.

The cfbot seems to crash on this patch during make check, but it doesn't
crash for me. I'm not even sure what date that cfbot status is from.

> BLOOM
> -----

Looks good, but make sure you change the commit message -- it still refers
to sorted mode.

+ * not entirely clear how to distrubute the space between those columns.

s/distrubute/distribute/

> MINMAX-MULTI
> ------------

> c) 0007 - A hybrid approach, using a buffer that is multiple of the
> user-specified value, with some safety min/max limits. IMO this is what
> we should use, although perhaps with some tuning of the exact limits.

That seems like a good approach.

+#include "access/hash.h" /* XXX strange that it fails because of
BRIN_AM_OID without this */

I think you want #include "catalog/pg_am.h" here.

> Attached is a spreadsheet with benchmark results for each of those three
> approaches, on different data types (byval/byref), data set types, index
> parameters (pages/values per range) etc. I think 0007 is a reasonable
> compromise overall, with performance somewhere in betwen 0005 and 0006.
> Of course, there are cases where it's somewhat slow, e.g. for data types
> with expensive comparisons and data sets forcing frequent compactions,
> in which case it's ~10x slower compared to regular minmax (in most cases
> it's ~1.5x). Compared to btree, it's usually much faster - ~2-3x as fast
> (except for some extreme cases, of course).
>
>
> As for the opclasses for indexes without "natural" distance function,
> implemented in 0008, I think we should drop it. In theory it works, but

Sounds reasonable.

> The other thing we were considering was using the new multi-minmax
> opclasses as default ones, replacing the existing minmax ones. IMHO we
> shouldn't do that either. For existing minmax indexes that's useless
> (the opclass seems to be working, otherwise the index would be dropped).
> But even for new indexes I'm not sure it's the right thing, so I don't
> plan to change this.

Okay.

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tang, Haiying 2021-02-09 14:48:02 RE: Support tab completion for upper character inputs in psql
Previous Message torikoshia 2021-02-09 14:31:10 Re: adding wait_start column to pg_locks