Re: MCV lists for highly skewed distributions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: John Naylor <jcnaylor(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MCV lists for highly skewed distributions
Date: 2018-02-07 13:29:58
Message-ID: CA+TgmoaqvtPa+YwU6iHXS7LPk_Z1fFxYjmDdviunatq46w7EqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 7, 2018 at 3:51 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> Thanks for testing. I agree, this new algorithm seems to stand up
> pretty well in the testing I've done too. One thing about it that can
> be tuned is the cutoff threshold for the relative standard error -- I
> chose 10% completely arbitrarily, but it could just as easily be 20%,
> 30% or even higher (in the patch s/0.01/0.04/ or s/0.01/0.09).

Maybe it'd be worth having a separate GUC for this, and a reloption to
override the GUC. It seems to me that it would be very reasonable to
want to separately control (a) how much sampling you want to do and
(b) how aggressively you want to be about including things in the MCV
list. Of course, even if we do that, it doesn't excuse us from
needing to set a good default value. And it might not be necessary to
do the extra work for this.

Looking at your data, it definitely seems like 10% would be too strict
-- if I'm reading this correctly, with a stats target in the 10-50
range, your normally-distributed table gets no MCVs at all, rather
than a number equal to the statistics target. That doesn't seem good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-02-07 13:37:44 Re: [HACKERS] path toward faster partition pruning
Previous Message Robert Haas 2018-02-07 13:19:44 Re: [HACKERS] path toward faster partition pruning