Re: Selectivity estimation for inet operators

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: "emre(at)hasegeli(dot)com" <emre(at)hasegeli(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Selectivity estimation for inet operators
Date: 2014-07-02 11:08:10
Message-ID: 4205E661176A124FAF891E0A6BA913526633C027@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 15 May 2014 14:04, Emre Hasegeli wrote:

>
> * matching first MCV to second MCV
> * searching first MCV in the second histogram
> * searching second MCV in the first histogram
> * searching boundaries of the first histogram in the second histogram
>
> Comparing the lists with each other slows down the function when
> statistics set to higher values. To avoid this problem I only use
> log(n) values of the lists. It is the first log(n) value for MCV,
> evenly separated values for histograms. In my tests, this optimization
> does not affect the planning time when statistics = 100, but does
> affect accuracy of the estimation. I can send the version without this
> optimization, if slow down with larger statistics is not a problem
> which should be solved on the selectivity estimation function.
>
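
To make the sampling described above concrete, here is a rough sketch of how I read it. This is not code from the patch: the names ilog2 and sample_even are hypothetical, and the base of the logarithm is not stated in the message, so base 2 is assumed here. It picks about log(n) evenly separated indexes out of a list of n histogram boundaries; for an MCV list the equivalent would simply be the first log(n) entries.

```c
#include <assert.h>
#include <stddef.h>

/* floor(log2(n)) for n >= 1; returns 0 for n == 0 (base 2 is an assumption). */
static size_t
ilog2(size_t n)
{
    size_t k = 0;

    while (n > 1)
    {
        n >>= 1;
        k++;
    }
    return k;
}

/*
 * Hypothetical helper: write up to ilog2(n) evenly separated indexes into
 * out[] (capacity out_cap) and return how many were written.  At least one
 * index is produced for a non-empty list when out_cap allows it.
 */
static size_t
sample_even(size_t n, size_t *out, size_t out_cap)
{
    size_t k;
    size_t step;
    size_t i;

    assert(out != NULL);

    if (n == 0)
        return 0;

    k = ilog2(n);
    if (k == 0)
        k = 1;                  /* a single-entry list still yields one sample */
    if (k > out_cap)
        k = out_cap;
    if (k == 0)
        return 0;               /* no room in the output array */

    step = n / k;               /* k <= n here, so step >= 1 */
    for (i = 0; i < k; i++)
        out[i] = i * step;

    return k;
}
```

With 100 boundaries this visits only 6 of them, which shows why the planning time stays flat and also why accuracy may drop as the lists grow.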

I have started reviewing this patch. So far I have done a basic review and some testing and debugging.

1. Patch applied to git head.
2. Basic testing works fine.

I have one question.

In the inet_his_inclusion_selec function, when the constant matches only the right boundary of a bucket and that bucket is the last one, it is never considered a partial match candidate.
If it is not the last bucket, that boundary becomes the left boundary of the next bucket and is treated as a partial match there, so there is no problem. For the last bucket, however, there is no next bucket, so the match is missed and the selectivity can be wrong.

Can't we treat it as a partial bucket match when it is the last bucket?
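
To illustrate the case I mean, here is a simplified model, not the patch's code. It uses integer boundaries and an integer range in place of inet values, and it assumes a full match counts as 1 bucket and a partial match as 0.5. The names Range, boundary_matches and matched_buckets are hypothetical, and the weights used in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inet network: a closed integer range. */
typedef struct
{
    int lo;
    int hi;
} Range;

/* A boundary "matches" when it falls inside the constant's range. */
static bool
boundary_matches(int boundary, Range c)
{
    return boundary >= c.lo && boundary <= c.hi;
}

/*
 * Count matched buckets over a sorted histogram with nbounds boundaries.
 * Bucket i spans bounds[i] .. bounds[i + 1].  A right-only match is normally
 * skipped because the same boundary is the left side of the next bucket.
 * When count_last_right is true, a right-only match on the last bucket is
 * counted as a partial match as well.
 */
static double
matched_buckets(const int *bounds, size_t nbounds, Range c,
                bool count_last_right)
{
    double match = 0.0;
    size_t i;

    assert(nbounds >= 2);

    for (i = 0; i + 1 < nbounds; i++)
    {
        bool left = boundary_matches(bounds[i], c);
        bool right = boundary_matches(bounds[i + 1], c);
        bool last = (i + 2 == nbounds);

        if (left && right)
            match += 1.0;       /* whole bucket matches */
        else if (left)
            match += 0.5;       /* partial match on the left side */
        else if (right && last && count_last_right)
            match += 0.5;       /* proposed: partial match on the last bucket */
    }
    return match;
}
```

With boundaries {10, 20, 30, 40} and a constant that matches only 40, this model returns 0 when count_last_right is false and 0.5 when it is true. A constant that matches only 20 gives 0.5 in both cases, because 20 is picked up as the left boundary of the second bucket.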

Apart from that, there is one spelling error you could correct
in the inet_his_inclusion_selec comments:
histogram boundies -> histogram boundaries :)

Thanks & Regards,
Dilip Kumar
