Re: MCV lists for highly skewed distributions

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, John Naylor <jcnaylor(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MCV lists for highly skewed distributions
Date: 2018-02-07 08:32:09
Message-ID: CAEZATCVwr4yH=9r-H8mtAtLxXGeREqEtGVJDA9iLOaHY9PYCDg@mail.gmail.com
Lists: pgsql-hackers

On 1 February 2018 at 17:49, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> One point which I want to emphasize is that the length of the MCV list
> bounds the estimated frequency of non-MCVs in two ways: no non-MCV is
> ever thought to be more frequent than the least-common MCVs, and
> however many non-MCVs we think we have (probably fewer than we
> actually have) have to fit into whatever percentage of the table is
> consumed by MCVs. This would be less important if we had reliable
> n_distinct estimates, but we don't. So, even throwing things into the
> MCV list that are no more common than the average item can improve
> planning in some cases.
>

That's a good point, and a nice explanation. I think that lends more
weight to the argument that we should be including as many MCVs as
possible, provided there's enough evidence to justify their inclusion.
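
To make the two bounds concrete, here is a minimal sketch in Python. It is
a simplified model of the arithmetic, not the actual planner code, and the
function name and its arguments are made up for illustration. It assumes
the estimate for a non-MCV value is the non-MCV, non-null fraction of the
table spread evenly over the remaining distinct values, clamped so that it
never exceeds the least-common MCV's frequency.

```python
def non_mcv_eq_selectivity(mcv_freqs, n_distinct, null_frac=0.0):
    """Estimate the selectivity of `col = x` for a value x NOT in the MCV list.

    mcv_freqs  -- frequencies of the MCV entries (hypothetical input)
    n_distinct -- estimated number of distinct non-null values
    null_frac  -- estimated fraction of NULLs
    """
    # Bound 2: all non-MCVs must fit into whatever the MCVs (and NULLs)
    # leave over.
    remaining = max(1.0 - sum(mcv_freqs) - null_frac, 0.0)

    # Spread the remainder evenly over the distinct values not in the list.
    other_distinct = max(n_distinct - len(mcv_freqs), 1)
    selec = remaining / other_distinct

    # Bound 1: no non-MCV is assumed more frequent than the least-common MCV.
    if mcv_freqs:
        selec = min(selec, min(mcv_freqs))
    return selec


# Same n_distinct estimate, one MCV versus three MCVs:
short_list = non_mcv_eq_selectivity([0.5], n_distinct=11)
long_list = non_mcv_eq_selectivity([0.5, 0.25, 0.125], n_distinct=11)

# An n_distinct underestimate, where the clamp is what limits the damage:
clamped = non_mcv_eq_selectivity([0.25, 0.0625], n_distinct=4)
```

With the same n_distinct, the longer list leaves less of the table to
divide among the non-MCVs, so each one gets a smaller estimate. In the
last call the even spread would give a figure larger than the least-common
MCV, so the clamp takes over, which is the case where an unreliable
n_distinct is compensated for by the list itself.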

Regards,
Dean
