Quick Links

Re: [PERFORM] Slow query: bitmap scan troubles

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [PERFORM] Slow query: bitmap scan troubles
Date:	2013-01-07 17:35:51
Message-ID:	12791.1357580151@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-performance

I wrote:
> Now, if we consider only CPU costs of index descent, we expect
> about log2(P) comparisons to be needed on each of the H upper pages
> to be descended through, that is we have total descent cost
> cpu_index_tuple_cost * H * log2(P)
> If we ignore the ceil() step as being a second-order correction, this
> can be simplified to
> cpu_index_tuple_cost * H * log2(N)/(H+1)

I thought some more about this and concluded that the above reasoning is
incorrect, because it ignores the fact that initial positioning on the
index leaf page requires another log2(P) comparisons (to locate the
first matching tuple if any). If you include those comparisons then the
H/(H+1) factor drops out and you are left with just "cost * log2(N)",
independently of the tree height.

But all is not lost for including some representation of the physical
index size into this calculation, because it seems plausible to consider
that there is some per-page cost for descending through the upper pages.
It's not nearly as much as random_page_cost, if the pages are cached,
but we don't have to suppose it's zero. So that reasoning leads to a
formula like
cost-per-tuple * log2(N) + cost-per-page * (H+1)
which is better than the above proposal anyway because we can now
twiddle the two cost factors separately rather than being tied to a
fixed idea of how much a larger H hurts.

As for the specific costs to use, I'm now thinking that the
cost-per-tuple should be just cpu_operator_cost (0.0025) not
cpu_index_tuple_cost (0.005). The latter is meant to model costs such
as reporting a TID back out of the index AM to the executor, which is
not what we're doing at an upper index entry. I also propose setting
the per-page cost to some multiple of cpu_operator_cost, since it's
meant to represent a CPU cost not an I/O cost.

There is already a charge of 100 times cpu_operator_cost in
genericcostestimate to model "general costs of starting an indexscan".
I suggest that we should consider half of that to be actual fixed
overhead and half of it to be per-page cost for the first page, then
add another 50 times cpu_operator_cost for each page descended through.
That gives a formula of

cpu_operator_cost * log2(N) + cpu_operator_cost * 50 * (H+2)

This would lead to the behavior depicted in the attached plot, wherein
I've modified the comparison lines (historical, 9.2, and HEAD behaviors)
to include the existing 100 * cpu_operator_cost startup cost charge in
addition to the fudge factor we've been discussing so far. The new
proposed curve is a bit above the historical curve for indexes with
250-5000 tuples, but the value is still quite small there, so I'm not
too worried about that. The people who've been complaining about 9.2's
behavior have indexes much larger than that.

Thoughts?

regards, tom lane

Attachment	Content-Type	Size
	image/png	4.1 KB
unknown_filename	text/plain	486 bytes

In response to

Re: [PERFORM] Slow query: bitmap scan troubles at 2013-01-06 23:03:05 from Tom Lane

Responses

Re: [PERFORM] Slow query: bitmap scan troubles at 2013-01-07 18:03:37 from Simon Riggs

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Simon Riggs	2013-01-07 18:03:37	Re: [PERFORM] Slow query: bitmap scan troubles
Previous Message	Robert Haas	2013-01-07 16:57:27	Re: ALTER .. OWNER TO error mislabels schema as other object type

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Simon Riggs	2013-01-07 18:03:37	Re: [PERFORM] Slow query: bitmap scan troubles
Previous Message	Patrick Dung	2013-01-07 17:18:02	Re: Sub optimal performance with default setting of Postgresql with FreeBSD 9.1 on ZFS