partial indexes and bitmap scans

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: partial indexes and bitmap scans
Date: 2017-03-09 16:01:21
Message-ID: 20170309160121.GH9812@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

Consider this:

create table t10 (c1 int, c2 int);
create index on t10 (c1) where c2 > 5;

\d t10
Table "sfrost.t10"
Column | Type | Modifiers
--------+---------+-----------
c1 | integer |
c2 | integer |
Indexes:
"t10_c1_idx" btree (c1) WHERE c2 > 5

insert into t10 select * from generate_series(1,10000) a, generate_series(1,10) b;
(repeat a bunch of times, if desired)

vacuum analyze t10;
set work_mem = '64kB';
set enable_indexscan = false;
set enable_seqscan = false;

=*> explain analyze select * from t10 where c2 > 6;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t10 (cost=6496.49..15037.50 rows=318653 width=8) (actual time=34.682..116.236 rows=320000 loops=1)
Recheck Cond: (c2 > 5)
Rows Removed by Index Recheck: 327502
Filter: (c2 > 6)
Rows Removed by Filter: 80000
Heap Blocks: exact=642 lossy=2898
-> Bitmap Index Scan on t10_c1_idx (cost=0.00..6416.83 rows=400081 width=0) (actual time=34.601..34.601 rows=400000 loops=1)
Planning time: 0.087 ms
Execution time: 124.229 ms
(9 rows)

Perhaps I'm missing something obvious, but isn't it a bit redundant to
have both a Recheck condition (which is the predicate of the index) and
a Filter condition (which is the user's predicate) when we've already
decided that the user's predicate must result in a subset of the
index's, as, otherwise, we wouldn't be able to use the index in the
first place?

In other words, it seems like we shouldn't need a Filter in the above
Bitmap Heap Scan, instead we should just make the Recheck be (c2 > 6).

I've not looked into the code side of this at all and there may be
reasons why this is hard to do, but it seems like a worthwhile
improvement to consider doing, though perhaps I'm missing some reason
why we need both the Recheck and the Filter in such cases for
correctness.

Thanks!

Stephen

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2017-03-09 16:02:53 Re: Declarative partitioning optimization for large amount of partitions
Previous Message Peter Eisentraut 2017-03-09 15:59:02 Re: use SQL standard error code for nextval