Re: Proposal: Improve bitmap costing for lossy pages

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Improve bitmap costing for lossy pages
Date: 2017-10-06 15:51:00
Message-ID: CAFiTN-sCOVOmfNp5Drjvh-Er5Qt5EG0-6h=rwk2+nvd3mMh1JA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 6, 2017 at 7:24 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> On Fri, Oct 6, 2017 at 6:08 PM, Alexander Kuzmenkov
> <a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:
>>
>>> Analysis: The estimated value of the lossy_pages is way higher than
>>> its actual value and reason is that the total_pages calculated by the
>>> "Mackert and Lohman formula" is not correct.
>>
>>
>> I think the problem might be that the total_pages includes cache effects and
>> rescans. For bitmap entries we should use something like relation pages *
>> selectivity.
>
> I have noticed that for the TPCH case if I use "pages * selectivity"
> it give me better results, but IMHO directly multiplying the pages
> with selectivity may not be the correct way to calculate the number of
> heap pages it can only give the correct result when all the TID being
> fetched are clustered. But on the other hand "Mackert and Lohman
> formula" formulae consider that all the TID's are evenly distributed
> across the heap pages which can also give the wrong estimation like we
> are seeing in our TPCH case.

I agree with the point that the total_pages included the cache effects
and rescan when loop_count > 1, that can be avoided if we always
calculate heap_pages as it is calculated in the else part
(loop_count=0). Fortunately, in all the TPCH query plan what I posted
up thread bitmap scan was never at the inner side of the NLJ so
loop_count was always 0. I will fix this.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-10-06 16:37:15 Re: Proposal: Local indexes for partitioned table
Previous Message Ashutosh Bapat 2017-10-06 15:48:02 Re: Partition-wise join for join between (declaratively) partitioned tables