Re: Bitmap table scan cost per page formula

From: Haisheng Yuan <hyuan(at)pivotal(dot)io>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)pivotal(dot)io>, Query Processing - All <qp-all(at)pivotal(dot)io>
Subject: Re: Bitmap table scan cost per page formula
Date: 2017-12-20 23:13:10
Message-ID: CAPW_87HEVvJ10dcUSJuuqMZy5P739w3-yqSw-Eqk2jqZ-pQcYg@mail.gmail.com
Lists: pgsql-hackers

Robert, you are right. The new formula serves Greenplum better than the
original formula, because our default random page cost is much higher than
Postgres's. We don't want the random page cost to always dominate the final
cost per page.
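
For reference, here is a minimal sketch of the original per-page formula, as it appears in cost_bitmap_heap_scan() in PostgreSQL's costsize.c. The Python function name and the sample page counts are hypothetical, and the value 100 is only an illustrative stand-in for a much higher default random page cost, not a figure taken from this thread. The new formula is not reproduced here because this message does not state it.

```python
import math


def bitmap_cost_per_page(pages_fetched, total_pages,
                         random_page_cost=4.0, seq_page_cost=1.0):
    """Original per-page cost interpolation for a bitmap heap scan.

    The cost slides from random_page_cost (few pages fetched) toward
    seq_page_cost (nearly the whole table fetched), following the
    square root of the fraction of pages fetched.
    """
    # The planner never estimates more fetched pages than the table has.
    pages_fetched = min(pages_fetched, total_pages)
    if pages_fetched >= 2.0:
        fraction = math.sqrt(pages_fetched / total_pages)
        return random_page_cost - (random_page_cost - seq_page_cost) * fraction
    # A single page is always charged as one random read.
    return random_page_cost


# Fetching a quarter of a 100-page table:
stock = bitmap_cost_per_page(25, 100)                         # 4 - 3 * 0.5 = 2.5
high = bitmap_cost_per_page(25, 100, random_page_cost=100.0)  # 100 - 99 * 0.5 = 50.5
```

With the stock PostgreSQL constants the per-page cost stays within a factor of four of the sequential cost, whereas with a very large random page cost the random term dominates the result over most of the range.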

~ ~ ~
Haisheng Yuan

On Wed, Dec 20, 2017 at 12:25 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Dec 19, 2017 at 10:25 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > In this old thread:
> > https://www.postgresql.org/message-id/CAGTBQpZ%2BauG%2BKhcLghvTecm4-cGGgL8vZb5uA3%3D47K7kf9RgJw%40mail.gmail.com
> > ..Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> >> Correct me if I'm wrong, but this looks like the planner not
> >> accounting for correlation when using bitmap heap scans.
> >>
> >> Checking the source, it really doesn't.
> >
> > ..which I think is basically right: the formula does distinguish between the
> > cases of small or large fraction of pages, but doesn't use correlation. Our
> > issue in that case seems to be mostly a failure of cost_index to account for
> > fine-scale deviations from large-scale correlation; but, if cost_bitmap
> > accounted for our high correlation metric (>0.99), it might've helped our case.
>
> I think this is a different and much harder problem than the one
> Haisheng Yuan is attempting to fix. His data shows that the cost
> curve has a nonsensical shape even when the assumption that pages are
> spread uniformly is correct. That should surely be fixed. Now, being
> able to figure out whether the assumption of uniform spread is correct
> on a particular table would be nice too, but it seems like a much
> harder problem.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
