Re: Cost of sort/order by not estimated by the query planner

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Laurent Laborde <kerdezixe(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Cost of sort/order by not estimated by the query planner
Date: 2009-12-02 17:01:24
Message-ID: 15107.1259773284@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> The exact break-even point between the two plans will vary depending
> on what percentage of the rows in the table satisfy the bitmap
> condition.

It's worse than that. The planner is not too bad about understanding
the percentage-of-rows problem --- at least, assuming you are using
a condition it has statistics for, which it doesn't for bitvector &&.
But whether the indexscan plan is fast will also depend on where the
matching rows are in the index ordering. If they're all towards the
end you can lose big, and the planner hasn't got stats to let it
predict that. It just assumes the filter condition is uncorrelated
to the ordering condition.

My own advice would be to forget the bitmap field and see if you can't
use a collection of plain boolean columns instead. You might still
lose if there's a correlation problem, but "bitfield && B'1'" is
absolutely positively guaranteed to produce stupid row estimates and
hence bad plan choices.

Or you could work on introducing a non-stupid selectivity estimator
for &&, but it's not a trivial project.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Sabino Mullane 2009-12-02 17:22:47 Re: [CORE] EOL for 7.4?
Previous Message Robert Haas 2009-12-02 16:49:20 Re: Page-level version upgrade (was: Block-level CRC checks)

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2009-12-02 18:47:27 Re: Order by (for 15 rows) adds 30 seconds to query time
Previous Message Robert Haas 2009-12-02 16:47:44 Re: Cost of sort/order by not estimated by the query planner