Re: Extra cost of "lossy mode" Bitmap Scan plan

From: higepon <higepon(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Extra cost of "lossy mode" Bitmap Scan plan
Date: 2009-05-07 06:49:52
Message-ID: f07386410905062349g477466feja6f95687823a093f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.
I apologize for this late reply.

Tom Lane wrote:
> I think it's probably useless. In the first place, at reasonable values
> of work_mem the effect is going to be negligible (in the sense that a
> plain indexscan would never win). In the second place, there isn't any
> way to guess the extent of lossiness at plan time --- it depends on how
> much the target rows are "clumped" on particular pages. The planner
> hasn't got any stats that would let it guess that, and even if we tried
> to collect such stats they'd probably be too unstable to be useful.

Thank you. I understand your two points.

But in this context, I can't understand why Bitmap Scan is faster than
Index Scan.

If there are many where conditions like following,
I can understand why Bitmap Scan is faster than Index Scan.
select * from emp where emp_no > 10000 and sex = 1 and salary > 5000;
Bitmap Scan merges each result bitmaps and after the merge read tuples.

But I don't understand this case.
select * from emp where emp_no > 10000;
Is Bitmap Scan is faster than Index Scan in this case ?
If so, why ?

My understanding that both Index Scan and Bitmap Scan scans index on
emp_no and they are almost same cost.
In addition Bitmap Scan creates a bitmap. So Bitmap Scan is a little bit slower?

Best regards,

-----
MINOWA Taro (Higepon)

Cybozu Labs, Inc.

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/

On Tue, Apr 28, 2009 at 11:36 PM, Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Tue, Apr 28, 2009 at 3:02 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> You may be right, but on the other hand, I'm not sure there's any
>> sense in NOT trying to model the impact of the additional heap
>> fetches.
>
> Yeah, the flip side of the argument is that we generally try to do the
> best job we can modeling costs and let the arithmetic work out how it
> however it does because you never know what kind of wacky situations
> will arise planning queries and and the better the estimates the
> better your chance of coming up with a good plan.
>
> For example the planner may have other join orders which allow it to
> avoid accessing those records entirely. So the comparison with a
> nested loop might not be the only comparison that matters. It might be
> a case of whether to run a bitmap scan against this table or some scan
> against another table to drive the join.
>
> I have been running benchmarks comparing bitmap heap scans against
> index scans amongst other comparisons. I haven't done CVS head yet but
> on an older version I'm seeing with effective_io_concurrency set to 0
> scanning 1000 random tuples throughout a 130G table (one searched
> tuple per page) on a machine with 64G of ram after repeated executions
> index scans settle down to about 245s vs 205s for bitmap scans (for
> 100 iterations). So they're about 16% faster for this use case.
>
> Incidentally with effective_io_concurrency set to 30 on this 30-drive
> raid the bitmap scans go down to 17s :)
>
> --
> greg
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2009-05-07 07:11:09 Re: WIP patch for TODO Item: Add prompt escape to display the client and server versions
Previous Message Tom Lane 2009-05-07 05:33:06 Re: xml2 in 8.4 still alive?