From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | John Naylor <jcnaylor(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: MCV lists for highly skewed distributions |
Date: | 2018-03-17 19:32:12 |
Message-ID: | CAEZATCWTSwsV11Xc9fboSzWFoKi_Gz9MQh-P+weP105DL4E0HA@mail.gmail.com |
Lists: | pgsql-hackers |
On 17 March 2018 at 18:40, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> Currently, analyze_mcv_list only checks if the frequency of the current
> item is significantly higher than the non-MCV selectivity. My question
> is if it shouldn't also consider if removing the item from MCV would not
> increase the non-MCV selectivity too much.
>
Oh, I see what you're saying. In theory, each MCV item we remove is
not significantly more common than the non-MCV items at that point, so
removing it shouldn't significantly increase the non-MCV selectivity.
It's possible the cumulative effect of removing multiple items might
start to add up, but I think it would necessarily be a slow effect,
and it would keep getting slower as more items are removed. Isn't this
equivalent to constructing a sequence of numbers where each number is
a little greater than the average of all the preceding numbers, so
that it ends up virtually flat-lining?
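
To make the analogy concrete, here is a minimal Python sketch of such a
sequence. It is only a toy model: the function name, the starting value
and the factor `1 + eps` standing in for "a little greater" are all
assumptions for illustration, and none of it is taken from
analyze_mcv_list.

```python
def slowly_growing_sequence(n_terms, eps=0.1, first=1.0):
    """Build a sequence where each new term is (1 + eps) times the
    average of all preceding terms.

    eps and first are illustrative assumptions, not PostgreSQL values.
    """
    values = [first]
    running_sum = first
    for _ in range(n_terms - 1):
        # Average of everything so far, nudged up by a small factor.
        next_value = (1.0 + eps) * running_sum / len(values)
        values.append(next_value)
        running_sum += next_value
    return values


if __name__ == "__main__":
    seq = slowly_growing_sequence(1000)
    # Print a few terms to see how slowly the sequence climbs.
    for i in (1, 10, 100, 1000):
        print(i, round(seq[i - 1], 4))
```

In this toy model every step up is smaller than the one before it, so
the curve flattens out, although the increase never reaches exactly
zero.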
Regards,
Dean