Re: Bitmap scan is undercosted?

From: Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Bitmap scan is undercosted?
Date: 2017-12-03 21:15:01
Message-ID: 501bd53b-bc66-bf58-7590-939dc5cba170@gmail.com
Lists: pgsql-hackers pgsql-performance

On 02/12/2017 23:17, Jeff Janes wrote:
> Right, so there is a cpu costing problem (which could only be fixed by
> hacking postgresql and recompiling it), but it is much smaller of a
> problem than the IO cost not being accurate due to the high hit rate.
> Fixing the CPU costing problem is unlikely to make a difference to
> your real query.  If you set the page costs to zero, what happens to
> your real query?
I can't reproduce the exact issue on the real database any more. The
query recently started to use the slow bitmap scan, and kept doing so
for some time, but now it has switched back to the index scan. The
table involved in the query gets modified a lot. It has hundreds of
millions of rows. Lots of new rows are appended to it every day, and
the oldest rows are sometimes removed. The table is analyzed at least
daily. It's possible that the statistics were updated and that caused
the query to be planned differently. But I would still like to
understand why that issue happened, and how to fix it properly, in case
it returns.
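
For future reference, something like this should at least show when the
statistics were last refreshed (the table name below is just a
placeholder):

  -- when was the table last analyzed, manually or by autovacuum?
  SELECT relname, last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
  WHERE relname = 'mytable';
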
>
> But I doubt that the settings seq_page_cost = random_page_cost =
> 0.0 should actually be used.
>
>
> Why not?  If your production server really has everything in memory
> during normal operation, that is the correct course of action.  If you
> ever restart the server, then you could have some unpleasant time
> getting it back up to speed again, but pg_prewarm could help with that.
In the real database, not everything is in memory. There are 200GB+ of
RAM, but DB is 500GB+. The table involved in the query itself is 60GB+
of data and 100GB+ of indexes. I'm running the test case in a way where
all reads are done from RAM, only to make it easier to reproduce and to
avoid unrelated effects.
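
To confirm that the test case really reads everything from RAM, an
EXPLAIN with the BUFFERS option can be used; the "shared read" counters
should stay at zero (the query below is only a placeholder):

  -- all blocks should come from shared buffers ("shared hit"), not disk
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT count(*) FROM mytable WHERE aid = 1 AND bid = 2;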

As far as I know, costs in Postgres were designed to be relative to
seq_page_cost, which is why it is usually set to 1.0. Even if
everything were in RAM, page accesses would still not have zero cost.
Setting 0.0 just seems too extreme, because all other non-zero costs
would then become infinitely larger by comparison.
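
If anything, a small but non-zero value seems safer for such an
experiment, e.g. something along these lines in a single session (0.01
is an arbitrary value for illustration, not a recommendation):

  -- lower the page costs only for the current session
  SET seq_page_cost = 0.01;
  SET random_page_cost = 0.01;
  EXPLAIN ANALYZE SELECT count(*) FROM mytable WHERE aid = 1 AND bid = 2;
  RESET seq_page_cost;
  RESET random_page_cost;
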
> If you really want to target the plan with the BitmapAnd, you should
> increase cpu_index_tuple_cost and/or cpu_operator_cost but not
> increase cpu_tuple_cost.  That is because the  unselective bitmap
> index scan does not incur any cpu_tuple_cost, but does incur
> index_tuple and operator costs.  Unfortunately all other index scans
> in the system will also be skewed by such a change if you make the
> change system-wide.
Exactly. I'd like to understand why the worse plan is being chosen, and
1) if it's fixable by tuning costs, to figure out the right settings
that could be used in production; 2) if there is a bug in the Postgres
optimizer, to bring some attention to it, so that it's eventually fixed
in a future release; 3) if Postgres is supposed to work this way, then
at least I (and anyone who reads this thread) will understand it
better.
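
For the experiments, a per-session change along these lines should show
where the plan flips without affecting other sessions (the values are
arbitrary; the defaults are cpu_index_tuple_cost = 0.005 and
cpu_operator_cost = 0.0025, and the query is again a placeholder):

  -- raise the CPU costs only for the current session and compare plans
  SET cpu_index_tuple_cost = 0.02;
  SET cpu_operator_cost = 0.01;
  EXPLAIN ANALYZE SELECT count(*) FROM mytable WHERE aid = 1 AND bid = 2;
  RESET cpu_index_tuple_cost;
  RESET cpu_operator_cost;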

Regards,
Vitaliy
