Re: WIP: BRIN multi-range indexes

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP: BRIN multi-range indexes
Date: 2021-03-11 16:26:06
Message-ID: 71dfd257-c5ce-530d-90d3-7c744a28e97b@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here is an updated version of the patch series.

It fixes the assert failure (just remove the multiplication from it) and
adds a simple regression test that exercises this.

Based on the discussion so far, I've decided to keep just the new
signature of the consistent function. That's a bit simpler than having
to support both 3 and 4 parameters, and it would not deal with the NULL
changes anyway (mostly harmless code duplication, but still). I've also
realized the API documentation in SGML needs updating.

At this point, I think 0001-0006 parts are mostly committable.

As for the remaining two parts, the one dealing with correlation may not
be strictly necessary, but not having it (or something like it) may
result in not picking the BRIN index in some cases.

But maybe it's not a major problem. I tried the example from [1] but it
no longer triggers the issue for me - I'm not entirely sure why, but the
costing changed for some reason. It used to look like this:

QUERY PLAN
------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=5553.07..174800.78 rows=10 width=4)
(actual time=7.790..27.585 rows=10 loops=1)
Recheck Cond: (a = 10000)
Rows Removed by Index Recheck: 224182
Heap Blocks: lossy=992
-> Bitmap Index Scan on t_a_idx
(cost=0.00..5553.06 rows=9999977 width=0)
(actual time=7.006..7.007 rows=9920 loops=1)
Index Cond: (a = 10000)
Planning Time: 0.052 ms
Execution Time: 27.658 ms
(8 rows)

but now I'm getting this:

QUERY PLAN
------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=5546.22..97469.16 rows=10 width=4)
(actual time=5.264..32.051 rows=10 loops=1)
Recheck Cond: (a = 10000)
Rows Removed by Index Recheck: 314582
Heap Blocks: lossy=1392
-> Bitmap Index Scan on t_a_idx
(cost=0.00..5546.22 rows=3813995 width=0)
(actual time=2.534..2.536 rows=13920 loops=1)
Index Cond: (a = 10000)
Planning Time: 0.052 ms
Execution Time: 32.095 ms
(8 rows)

The index scan cost is about the same, but the heap scan is about half
the cost. The row estimates are a bit different, perhaps that's related.
The seqscan cost (169248) and duration (~500ms) is still about the same,
but so now we still pick the bitmap heap scan. Not sure we can rely on
this, though. Would be quite sad if we had new improved opclasses but
the planner often decided not to use them.

While playing with this, I discovered another weirdness - consider this
example (a more complete reproducer.sql script attached). If you try it
without the 0008 part, you'll get this:

init
----

create table t (a int, b bigint, c text) with (fillfactor = 50);

insert into t select
1000000 * random(),
1000000 * random(),
md5((1000000 * random())::int::text)
from generate_series(1,1000000) s(i);

create index bloom_a_idx on t using brin
(a int4_bloom_ops(n_distinct_per_range=6000));

analyze t;

query
-----

explain select * from t where a = 1000;

QUERY PLAN
-----------------------------------------------------------------
Bitmap Heap Scan on t (cost=660.03..14000.00 rows=2 width=45)
Recheck Cond: (a = 1000)
-> Bitmap Index Scan on bloom_a_idx
(cost=0.00..660.03 rows=6135 width=0)
Index Cond: (a = 1000)
(4 rows)

So far so good - we're using the right index.

create another index (regular minmax on random data)
----------------------------------------------------

create index brin_idx on t using brin (a);

analyze t;

query (again)
-------------

explain select * from t where a = 1000;

QUERY PLAN
------------------------------------------------------
Seq Scan on t (cost=0.00..33334.00 rows=2 width=45)
Filter: (a = 1000)
(2 rows)

Umm, what? We've created an index, we don't pick it, but it somehow
blocks us from using the right index? So we end up with a sequential
scan, which is much more expensive (both in terms of cost and duration)
than using the bloom index.

Turns out this is due to how choose_bitmap_and eliminates index paths
with the same set of clauses - it simply compares the cost for reading
the index itself. And the bloom filter is larger than plain minmax,
hence it loses. But this entirely ignores the cost increase at the heap
scan level - the minmax index is absulutely terrible, and it loses not
just to bloom index (which we've discarded already) but also to seqscan.

I had an idea of tweaking choose_bitmap_and to consider both the cost
and selectivity (similarly to how add_path considers statup/total cost),
and that did indeed resolve this particular case. This is what the 0008
part does.

But it may also have negative consequence, as demonstrated by the
reproducer2.sql script. So maybe the logic would need to be more
complicated. Or maybe there's no reliable solution, considering how
tricky/unreliable BRIN estimates are.

That being said, I don't think this is something we need to solve here,
and it may not actually be an issue at all. For this to happen there
need to be a terrible index on the same attribute (like the minmax index
in the example above). But why keeping such index anyway? Dropping it
would make the issue go away. If we have two indexes that both perform
reasonably (say, bloom and minmax-multi), the consequences are not that
bad. so this is interesting, but probably fine.

[1]
https://www.postgresql.org/message-id/20200807162701.v7lyzun32lqku3md%40development

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-introduce-bsearch_arg-20210311.patch text/x-patch 4.8 KB
0002-Pass-all-scan-keys-to-BRIN-consistent-funct-20210311.patch text/x-patch 23.5 KB
0003-Move-IS-NOT-NULL-handling-from-BRIN-support-20210311.patch text/x-patch 23.1 KB
0004-Optimize-allocations-in-bringetbitmap-20210311.patch text/x-patch 4.5 KB
0005-BRIN-bloom-indexes-20210311.patch text/x-patch 128.7 KB
0006-BRIN-minmax-multi-indexes-20210311.patch text/x-patch 241.8 KB
0007-Ignore-correlation-for-new-BRIN-opclasses-20210311.patch text/x-patch 4.2 KB
0008-Consider-selectivity-in-choose_bitmap_and-20210311.patch text/x-patch 2.0 KB
reproduce2.sql application/sql 427 bytes
reproducer.sql application/sql 653 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-03-11 16:26:51 Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY
Previous Message Robert Haas 2021-03-11 16:09:52 Re: pg_amcheck contrib application