Re: WIP: BRIN multi-range indexes

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-09-10 20:01:37
Message-ID: 20200910200137.GA25491@alvherre.pgsql
Lists: pgsql-hackers

On 2020-Sep-10, Tomas Vondra wrote:

> I've spent a bit of time experimenting with this. My idea was to allow
> keeping an "expanded" version of the summary somewhere. As the addValue
> function only receives BrinValues I guess one option would be to just
> add bv_mem_values field. Or do you have a better idea?

Maybe it's okay to pass the BrinMemTuple to the add_value function, and
keep something there. Or maybe that's pointless and just a new field in
BrinValues is okay.

> Of course, more would need to be done:
>
> 1) We'd need to also pass the right memory context (bt_context seems
> like the right thing, but that's not something addValue sees now).

You could use GetMemoryChunkContext() for that.
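
To illustrate why that works without passing the context around, here is
a toy model of the underlying idea: every allocated chunk carries a small
header that records its owning context, so the owner can be recovered
from the chunk pointer alone. This is only a sketch, not PostgreSQL's
allocator; ToyContext, ToyChunkHeader and the toy_* functions are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory context (not MemoryContextData). */
typedef struct ToyContext
{
    const char *name;
    size_t      nchunks;        /* chunks handed out so far */
} ToyContext;

/* Header placed in front of every chunk, recording its owner. */
typedef union ToyChunkHeader
{
    ToyContext *owner;
    long double align_ld;       /* these two members only force alignment */
    long long   align_ll;
} ToyChunkHeader;

/* Allocate size bytes in ctx; returns NULL on failure. */
static void *
toy_alloc(ToyContext *ctx, size_t size)
{
    ToyChunkHeader *hdr = malloc(sizeof(ToyChunkHeader) + size);

    if (hdr == NULL)
        return NULL;
    hdr->owner = ctx;
    ctx->nchunks++;
    return hdr + 1;             /* payload starts right after the header */
}

/* Recover the owning context from nothing but the chunk pointer. */
static ToyContext *
toy_get_chunk_context(void *chunk)
{
    assert(chunk != NULL);
    return ((ToyChunkHeader *) chunk - 1)->owner;
}

/* Release a chunk obtained from toy_alloc. */
static void
toy_free(void *chunk)
{
    free((ToyChunkHeader *) chunk - 1);
}
```

In the real code the equivalent lookup would only be valid on a pointer
that is itself a separately allocated chunk in the desired context, not
on a pointer into the middle of a larger allocation.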

> 2) We'd also need to specify some sort of callback that serializes the
> in-memory value into bt_values. That's not something addValue can do,
> because it doesn't know whether it's the last value in the range etc. I
> guess one option would be to add yet another support proc, but I guess a
> simple callback would be enough.

Hmm.
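
For reference, the shape of the proposal quoted above can be sketched as
follows: the add-value step only appends to an expanded in-memory buffer,
and a separate callback folds that buffer into the compact form once the
range is finished (or the buffer fills up). This is a toy under stated
assumptions, not the patch: ToySummary, its fields and the toy_* functions
are hypothetical, and a plain {min, max} pair stands in for whatever the
multi-range summary actually serializes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

typedef struct ToySummary
{
    int         stored[2];      /* compact form: {min, max} (stand-in only) */
    int         has_stored;     /* 0 until the first value is folded in */
    int         pending[TOY_MAX_PENDING];   /* expanded form: raw values */
    size_t      npending;
    void      (*serialize) (struct ToySummary *);  /* hypothetical callback */
} ToySummary;

/* Fold all pending values into the compact form and empty the buffer. */
static void
toy_serialize(ToySummary *s)
{
    for (size_t i = 0; i < s->npending; i++)
    {
        int         v = s->pending[i];

        if (!s->has_stored)
        {
            s->stored[0] = s->stored[1] = v;
            s->has_stored = 1;
        }
        else
        {
            if (v < s->stored[0])
                s->stored[0] = v;
            if (v > s->stored[1])
                s->stored[1] = v;
        }
    }
    s->npending = 0;
}

/* Cheap per-row step: append, serializing early only if the buffer is full. */
static void
toy_add_value(ToySummary *s, int v)
{
    if (s->npending == TOY_MAX_PENDING)
        s->serialize(s);
    s->pending[s->npending++] = v;
}

/* Called by the owner of the range, which knows no more values will come. */
static void
toy_finish_range(ToySummary *s)
{
    s->serialize(s);
}
```

The point of the split is that only the caller of toy_finish_range knows
the range is complete, so the expensive work happens once per range (or
once per full buffer) rather than once per added value.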

> I've hacked together an experimental version of this to see how much
> it would help, and it reduces the duration from ~4.6s to ~3.3s. Which is
> nice, but plain minmax is ~1.1s. I suppose there's room for further
> improvements in compare_combine_ranges/reduce_combine_ranges and so on,
> but I still think there'll always be a gap compared to plain minmax.

The main reason I'm talking about desupporting plain minmax is that,
even if it's amazingly fast, it loses quite quickly in real-world cases
because of loss of correlation. Minmax's build time is pretty much
determined by the speed at which you can seqscan the table. I don't think
we lose much if we add overhead in order to create an index that is 100x
more useful.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
