Re: Bitmap table scan cost per page formula

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haisheng Yuan <hyuan(at)pivotal(dot)io>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bitmap table scan cost per page formula
Date: 2017-12-21 06:05:20
Message-ID: CAMkU=1whYV+L_SJDr8Hw1a3TtFrzP9+r_Ee3SegSm_-v58VkUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 20, 2017 at 5:03 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> The parabola is probably wrong in detail --- its behavior as we approach
> reading all of the pages ought to be more asymptotic, seems like.
> I suppose that the reason it appears to go below the seqscan cost at the
> right is that even the rightmost end of the curve doesn't reflect reading
> all of the *tuples*, just one tuple per page, so that there's some CPU
> savings from not inspecting all the tuples.

I think that the graph is only about the IO costs of the heap scan, and the
CPU costs are an independent issue.

The reason it drops below the seqscan on the far right edge is much
simpler. 10,000 is the point where 100% of the pages are being read, so
the two scans converge to the same value. The graph continues above 10,000
but it is meaningless as a bitmap heap scan will never read more than 100%
of the table pages and so the math no longer represents a reality.

Cheers,

Jeff

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-12-21 06:08:11 Re: domain cast in parameterized vs. non-parameterized query
Previous Message Jeff Janes 2017-12-21 05:55:40 Re: Bitmap table scan cost per page formula