BUG #15481: possible not using dependencies statistics when estimate row count for bitmap index scan node

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: alexey(dot)ermakov(at)dataegret(dot)com
Subject: BUG #15481: possible not using dependencies statistics when estimate row count for bitmap index scan node
Date: 2018-11-01 10:18:58
Message-ID: 15481-b4f5350da9baa975@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15481
Logged by: Alexey Ermakov
Email address: alexey(dot)ermakov(at)dataegret(dot)com
PostgreSQL version: 10.5
Operating system: Debian 8
Description:

Hello,

I found a case when postgres choose a bad plan perhaps because it doesn't
use dependencies statistics for bitmap index scan.

Here is simple case to show the problem:

=============================
--drop table test_statistics;
postgres=# create table test_statistics as select id, id % 100 as a, id %
100 as b from generate_series(1, 50000) gs(id);
SELECT 50000
postgres=# create index concurrently on test_statistics using btree(id)
where a = 1 and b = 1;
CREATE INDEX
postgres=# analyze test_statistics;
ANALYZE
postgres=# set enable_indexscan = off;
SET
postgres=# explain analyze select * from test_statistics where a = 1 and b =
1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_statistics (cost=0.40..0.90 rows=4 width=12)
(actual time=0.158..0.588 rows=500 loops=1)
Recheck Cond: ((a = 1) AND (b = 1))
Heap Blocks: exact=270
-> Bitmap Index Scan on test_statistics_id_idx (cost=0.00..0.40 rows=4
width=0) (actual time=0.097..0.097 rows=500 loops=1)
Planning time: 0.590 ms
Execution time: 0.779 ms
(6 rows)

postgres=# create statistics test_statistics_a_b (dependencies) on a, b from
test_statistics;
CREATE STATISTICS
postgres=# analyze test_statistics;
ANALYZE
postgres=# explain analyze select * from test_statistics where a = 1 and b =
1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_statistics (cost=0.53..1.15 rows=490 width=12)
(actual time=0.093..0.375 rows=500 loops=1)
Recheck Cond: ((a = 1) AND (b = 1))
Heap Blocks: exact=270
-> Bitmap Index Scan on test_statistics_id_idx (cost=0.00..0.41 rows=5
width=0) (actual time=0.061..0.061 rows=500 loops=1)
Planning time: 0.294 ms
Execution time: 0.430 ms
(6 rows)
=============================

Please note that in last plan Bitmap Heap Scan and Bitmap Index Scan nodes
have different estimates for row count. I think Bitmap Heap Scan node uses

dependencies statistics but Bitmap Index Scan doesn't. Is it expected
behavior?

Real example (anonymized):
postgres choose this plan
-> Bitmap Heap Scan on t (cost=10590.66..11080.89 rows=16340 width=16)
(actual time=2918.297..3022.014 rows=6166 loops=1)
Recheck Cond: (...)
Rows Removed by Index Recheck: 42768
Filter: (...)
Rows Removed by Filter: 181
Heap Blocks: exact=43442
-> BitmapAnd (cost=10590.66..10590.66 rows=2887 width=0) (actual
time=2905.379..2905.379 rows=0 loops=1)
-> Bitmap Index Scan on i1 (cost=0.00..995.32 rows=125916
width=0) (actual time=16.652..16.652 rows=114469 loops=1)
Index Cond: (published_at > '2018-10-31
00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on i2 (cost=0.00..9586.92 rows=1748888
width=0) (actual time=2833.954..2833.954 rows=10969983 loops=1)

instead of this
-> Index Scan Backward using i1 on t (cost=0.57..18161.40 rows=16340
width=16) (actual time=1.228..195.337 rows=6166 loops=1)
Index Cond: (published_at > '2018-10-31 00:00:00'::timestamp without
time zone)
Filter: (...)
Rows Removed by Filter: 108266

because underestimated number of rows for Bitmap Index Scan on i2 (partial
index, for columns in index predicate there is dependencies statistics).

Thanks,
Alexey Ermakov

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2018-11-01 14:26:54 Re: BUG #15480: psql 11 array concatenation in CASE takes on values from the CASE expression when using enum_range
Previous Message Wèi Cōngruì 2018-11-01 09:30:17 Re: BUG #15478: 配置文件 pg_hba.conf 异常