Re: Bitmap scan is undercosted?

From: Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Bitmap scan is undercosted?
Date: 2017-12-01 23:08:03
Message-ID: 2409b7fa-33dc-fa06-87b3-408774201885@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 01/12/2017 20:34, Justin Pryzby wrote:
> On Fri, Dec 01, 2017 at 07:40:08PM +0200, Vitaliy Garnashevich wrote:
>> We recently had an issue in production, where a bitmap scan was chosen
>> instead of an index scan. Despite being 30x slower, the bitmap scan had
>> about the same cost as the index scan.
> Me too, see also:
> https://www.postgresql.org/message-id/flat/CAH2-WzkRTggiy_LKQUu-oViyp6y_Hhz-a1yWacPy4tcYWV1HoA%40mail(dot)gmail(dot)com#CAH2-WzkRTggiy_LKQUu-oViyp6y_Hhz-a1yWacPy4tcYWV1HoA(at)mail(dot)gmail(dot)com
>
>> drop table if exists aaa;
>> create table aaa as select (id%100)::int num, (id%10=1)::bool flag from
>> generate_series(1, 10000000) id;
>> create index i1 on aaa  (num);
>> create index i2 on aaa  (flag);
>> analyze aaa;
>>
>> select relname, reltuples::bigint, relpages::bigint,
>> (reltuples/relpages)::bigint tpp from pg_class where relname
>> in('aaa','i1','i2') order by relname;
>> "aaa";9999985;44248;226
>> "i1";9999985;27422;365
>> "i2";9999985;27422;365
>>
>> The query was:
>> explain (analyze,verbose,costs,buffers)
>> select count(*) from aaa where num = 1 and flag = true;
> Note that id%100==1 implies flag='t', so the planner anticipates retrieving
> fewer rows than it will ultimately read, probably by 2x. It makes sense that
> causes the index scan to be more expensive than expected, but that's only
> somewhat important, since there's no joins involved.
I don't think the planner is that smart to account for correlation
between values in different columns. When different values are used in
filter (num=2, num=39, num=74), the query actually runs faster, while
still being about twice slower than using an index scan. But the cost
does not change much. It jumps up and down for different values, but
it's still close to the initial value.

Aggregate  (cost=24239.02..24239.03 rows=1 width=8) (actual
time=105.239..105.239 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=3011
  ->  Bitmap Heap Scan on public.aaa  (cost=12812.05..24214.48
rows=9816 width=0) (actual time=105.236..105.236 rows=0 loops=1)
        Output: num, flag
        Recheck Cond: (aaa.num = 39)
        Filter: aaa.flag
        Buffers: shared hit=3011
        ->  BitmapAnd  (cost=12812.05..12812.05 rows=9816 width=0)
(actual time=105.157..105.157 rows=0 loops=1)
              Buffers: shared hit=3011
              ->  Bitmap Index Scan on i1  (cost=0.00..1134.94
rows=97667 width=0) (actual time=15.725..15.725 rows=100000 loops=1)
                    Index Cond: (aaa.num = 39)
                    Buffers: shared hit=276
              ->  Bitmap Index Scan on i2  (cost=0.00..11671.96
rows=1005003 width=0) (actual time=77.920..77.920 rows=1000000 loops=1)
                    Index Cond: (aaa.flag = true)
                    Buffers: shared hit=2735
Planning time: 0.104 ms
Execution time: 105.553 ms

Aggregate  (cost=65785.99..65786.00 rows=1 width=8) (actual
time=48.587..48.587 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=44524
  ->  Index Scan using i1 on public.aaa  (cost=0.44..65761.45 rows=9816
width=0) (actual time=48.583..48.583 rows=0 loops=1)
        Output: num, flag
        Index Cond: (aaa.num = 39)
        Filter: aaa.flag
        Rows Removed by Filter: 100000
        Buffers: shared hit=44524
Planning time: 0.097 ms
Execution time: 48.620 ms

>
> The reason why it's more than a bit slower is due to the "density" [0] of the
> heap pages read. num=1 is more selective than flag=true, so it scans i1,
> reading 1% of the whole table. But it's not reading the first 1% or
> some other 1% of the table, it reads tuples evenly distributed across the
> entire table (226*0.01 = ~2 rows of each page). Since the index was created
> after the INSERT, the repeated keys (logical value: id%100) are read in
> physical order on the heap, so this is basically doing a seq scan, but with the
> additional overhead of reading the index, and maybe doing an fseek() before
> each/some read()s, too. You could confirm that by connecting strace to the
> backend before starting the query.
>
> Since you did that using % and with indices added after the INSERT, you can't
> improve it by reindexing (as I was able to for my case). That's an elegant
> test case, so thanks.
>
> I think shared_buffers=512MB is just small enough for this test to be painful
> for 1e7 rows. I see the table+index is 559MB.
           table           | ~count  |    size    |   toast |  idx   |
size + toast + idx
---------------------------+---------+------------+------------+--------+--------------------
 aaa                       | 9999994 | 346 MB     | 0 bytes    | 428 MB
| 774 MB

But the plan says all buffers are "shared hit", and none "read", so
that's probably not an issue.

>
> I don't know if that's really similar to your production use case, but I would
> recommend trying BRIN indices, which always require a bitmap scan. Note that
> some things (like max()) that can use an btree index cannot use brin. PG10.1
> has WITH autosummarize, which was important for our use, since we rarely do
> UPDATEs or DELETEs so tables are rarely vacuumed (only analyzed).
Yes, BRIN indexes should be beneficial for our case, because there is a
date column which grows over time (not strictly regularly, but still).
Unfortunately, we're still migrating our databases from 9.3 to 9.6.
Anyway, thanks for the advice.
>
> Justin
>
> [0] I'm borrowing Jeff's language from here:
> https://www.postgresql.org/message-id/CAMkU%3D1xwGn%2BO0jhKsvrUrbW9MQp1YX0iB4Y-6h1mEz0ffBxK-Q%40mail.gmail.com
> "density" wasn't our problem, but it's a perfect description of this issue.
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2017-12-01 23:11:05 Re: Bitmap scan is undercosted?
Previous Message Fabien COELHO 2017-12-01 22:23:37 Re: [HACKERS] pow support for pgbench

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2017-12-01 23:11:05 Re: Bitmap scan is undercosted?
Previous Message Tom Lane 2017-12-01 21:33:48 Re: Bad plan for ltree predicate <@