Re: Bitmap scan is undercosted?

From: Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Bitmap scan is undercosted?
Date: 2017-12-02 07:08:38
Message-ID: f8edc163-7ebe-cdb8-d068-685355045f24@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 02/12/2017 07:51, Jeff Janes wrote:
> On Fri, Dec 1, 2017 at 3:54 PM, Vitaliy Garnashevich
> <vgarnashevich(at)gmail(dot)com <mailto:vgarnashevich(at)gmail(dot)com>> wrote:
>
> On 02/12/2017 01:11, Justin Pryzby wrote:
>
> I tried to reproduce this issue and couldn't, under PG95 and 10.1:
>
> On Fri, Dec 01, 2017 at 12:34:27PM -0600, Justin Pryzby wrote:
>
> On Fri, Dec 01, 2017 at 07:40:08PM +0200, Vitaliy
> Garnashevich wrote:
>
> We recently had an issue in production, where a bitmap
> scan was chosen
> instead of an index scan. Despite being 30x slower,
> the bitmap scan had
> about the same cost as the index scan.
> drop table if exists aaa;
> create table aaa as select (id%100)::int num,
> (id%10=1)::bool flag from
> generate_series(1, 10000000) id;
> create index i1 on aaa  (num);
> create index i2 on aaa  (flag);
> analyze aaa;
>
> What is:
> effective_io_concurrency
> max_parallel_workers_per_gather (I gather you don't have this)
>
> effective_io_concurrency = 0
> max_parallel_workers_per_gather = 0
>
> Did you notice random_page_cost = 1.5 ?
>
>
> For the aaa.num = 39 case, the faster index scan actually does hit 15
> times more buffers than the bitmap scan does.  While 1.5 is lot lower
> than 4.0, it is still much higher than the true cost of reading a page
> from the buffer cache.   This why the index scan is getting punished. 
> You could lower random_page_cost and seq_page_cost to 0, to remove
> those considerations.  (I'm not saying you should do this on your
> production system, but rather you should do it as a way to investigate
> the issue.  But it might make sense on production as well)
seq_page_cost = 1.0
random_page_cost = 1.0*
*explain analyze select * from aaa where num = 2 and flag = true;

Bitmap Heap Scan on aaa  (cost=11536.74..20856.96 rows=10257 width=5)
(actual time=108.338..108.338 rows=0 loops=1)
  ->  BitmapAnd  (cost=11536.74..11536.74 rows=10257 width=0) (actual
time=108.226..108.226 rows=0 loops=1)
        ->  Bitmap Index Scan on i1  (cost=0.00..1025.43 rows=100000
width=0) (actual time=18.563..18.563 rows=100000 loops=1)
        ->  Bitmap Index Scan on i2  (cost=0.00..10505.93 rows=1025666
width=0) (actual time=78.493..78.493 rows=1000000 loops=1)

Index Scan using i1 on aaa  (cost=0.44..44663.58 rows=10257 width=5)
(actual time=51.264..51.264 rows=0 loops=1)

Here I've used the filter num = 2, which produces rows=0 at BitmapAnd,
and thus avoids a lot of work at "Bitmap Heap Scan" node, while still
leaving about the same proportion in bitmap vs index - the bitmap is
twice slower but twice less costly. It does not matter much which value
to use for the filter, if it's other than num = 1.

seq_page_cost = 0.0
random_page_cost = 0.0
explain analyze select * from aaa where num = 2 and flag = true;

Bitmap Heap Scan on aaa  (cost=753.00..2003.00 rows=10257 width=5)
(actual time=82.212..82.212 rows=0 loops=1)
  ->  Bitmap Index Scan on i1  (cost=0.00..750.43 rows=100000 width=0)
(actual time=17.401..17.401 rows=100000 loops=1)

Index Scan using i1 on aaa  (cost=0.44..1750.43 rows=10257 width=5)
(actual time=49.766..49.766 rows=0 loops=1)

The bitmap plan was reduced to use only one bitmap scan, and finally it
costs more than the index plan. But I doubt that the settings
seq_page_cost = random_page_cost = 0.0 should actually be used. Probably
it should be instead something like 1.0/1.0 or 1.0/1.1, but other costs
increased, to have more weight.

# x4 tuple/operator costs - bitmap scan still a bit cheaper
set seq_page_cost = 1.0;
set random_page_cost = 1.0;
set cpu_tuple_cost = 0.04;
set cpu_index_tuple_cost = 0.02;
set cpu_operator_cost = 0.01;

Bitmap Heap Scan on aaa  (cost=36882.97..46587.82 rows=10257 width=5)
(actual time=106.045..106.045 rows=0 loops=1)
  ->  BitmapAnd  (cost=36882.97..36882.97 rows=10257 width=0) (actual
time=105.966..105.966 rows=0 loops=1)
        ->  Bitmap Index Scan on i1  (cost=0.00..3276.74 rows=100000
width=0) (actual time=15.977..15.977 rows=100000 loops=1)
        ->  Bitmap Index Scan on i2  (cost=0.00..33584.72 rows=1025666
width=0) (actual time=79.208..79.208 rows=1000000 loops=1)

Index Scan using i1 on aaa  (cost=1.74..49914.89 rows=10257 width=5)
(actual time=50.144..50.144 rows=0 loops=1)

# x5 tuple/operator costs - switched to single bitmap index scan, but
now it costs more than the index scan
set seq_page_cost = 1.0;
set random_page_cost = 1.0;
set cpu_tuple_cost = 0.05;
set cpu_index_tuple_cost = 0.025;
set cpu_operator_cost = 0.0125;

Bitmap Heap Scan on aaa  (cost=4040.00..54538.00 rows=10257 width=5)
(actual time=82.338..82.338 rows=0 loops=1)
  ->  Bitmap Index Scan on i1  (cost=0.00..4027.18 rows=100000 width=0)
(actual time=19.541..19.541 rows=100000 loops=1)

Index Scan using i1 on aaa  (cost=2.17..51665.32 rows=10257 width=5)
(actual time=49.545..49.545 rows=0 loops=1)

I've also tried seq_page_cost = 1.0, random_page_cost = 1.1, but that
would require more than x10 increase in tuple/operator costs, to make
bitmap more costly than index.

>
>
> For this test I'm using SSD and Windows (if that matters). On
> production we also use SSD, hence lower random_page_cost. But with
> the default random_page_cost=4.0, the difference in cost between
> the index scan plan and the bitmap scan plan is even bigger.
>
>
> Since it is all shared buffers hits, it doesn't matter if you have SSD
> for this particular test case.
Agree. I've just tried to justify the value of random_page_cost, which
is lower than like 2.0.

> Cheers,
>
> Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Beena Emerson 2017-12-02 08:33:22 Re: [HACKERS] Runtime Partition Pruning
Previous Message Justin Pryzby 2017-12-02 06:41:13 Re: Bitmap scan is undercosted?

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2017-12-02 21:17:47 Re: Bitmap scan is undercosted?
Previous Message Justin Pryzby 2017-12-02 06:41:13 Re: Bitmap scan is undercosted?