Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2020-09-09 19:35:30
Message-ID: 20200909193530.x57p5nzawryndxzj@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 09, 2020 at 03:40:41PM -0300, Alvaro Herrera wrote:
>On 2020-Sep-09, John Naylor wrote:
>
>> create index on t using brin (a);
>> CREATE INDEX
>> Time: 1631.452 ms (00:01.631)
>
>> create index on t using brin (a int8_minmax_multi_ops);
>> CREATE INDEX
>> Time: 6521.026 ms (00:06.521)
>
>It seems strange that the multi-minmax index takes so much longer to
>build. I wonder if there's some obvious part of the algorithm that can
>be improved?
>

There are some minor optimizations possible - for example I noticed we
call minmax_multi_get_strategy_procinfo often because it happens in a
loop, and we could easily do it just once. But that saves only about 10%
or so, it's not a ground-breaking optimization.

The main reason for the slowness is that we pass the values one by one
to brin_minmax_multi_add_value - and on each call we need to deserialize
(and then sometimes also serialize) the summary, which may be quite
expensive. The regular minmax does not have this issue, it just swaps
the Datum value and that's it.

I see two possible optimizations - firstly, adding some sort of batch
variant of the add_value function, which would get a bunch of values
instead of just a single one, amortizing the serialization costs.

Another option would be to teach add_value to keep the deserialized
summary somewhere, and then force serialization at the end of the BRIN
page range. The end result would be roughly the same, I think.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-09-09 19:53:30 Re: WIP: BRIN multi-range indexes
Previous Message Andres Freund 2020-09-09 19:26:20 Re: SIGQUIT handling, redux