Re: More stable query plans via more predictable column statistics

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: More stable query plans via more predictable column statistics
Date: 2016-04-01 23:57:32
Message-ID: CACACo5TS-v4KkU-LdgZ-uhbKpSRWb1rjGaC5o3uWn9Gwvi2MQA@mail.gmail.com
Lists: pgsql-hackers

On Apr 1, 2016 23:14, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de> writes:
> > Alright. I'm attaching the latest version of this patch split in two
> > parts: the first one is NULLs-related bugfix and the second is the
> > "improvement" part, which applies on top of the first one.
>
> I've applied the first of these patches,

Great news, thank you!

> broken into two parts first
> because it seemed like there were two issues and second because Tomas
> deserved primary credit for one part, ie realizing we were using the
> Haas-Stokes formula wrong.
>
> As for the other part, I committed it with one non-cosmetic change:
> I do not think it is right to omit "too wide" values when considering
> the threshold for MCVs. As submitted, the patch was inconsistent on
> that point anyway since it did it differently in compute_distinct_stats
> and compute_scalar_stats. But the larger picture here is that we define
> the MCV population to exclude nulls, so it's reasonable to consider a
> value as an MCV even if it's greatly outnumbered by nulls. There is
> no such exclusion for "too wide" values; those things are just an
> implementation limitation in analyze.c, not something that is part of
> the pg_statistic definition. If there are a lot of "too wide" values
> in the sample, we don't know whether any of them are duplicates, but
> we do know that the frequencies of the normal-width values have to be
> discounted appropriately.

Okay.
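
To make sure I read the "too wide" point correctly, here is a minimal sketch of the arithmetic as I understand it. This is not the code in analyze.c; the function name, its parameters and the sample numbers are all made up for illustration. It only shows that a value's share of the non-null sample shrinks when the "too wide" rows stay in the denominator.

```python
def value_fraction(count, sample_rows, null_rows, toowide_rows,
                   exclude_toowide=False):
    """Fraction of the non-null sample taken up by one value.

    Hypothetical helper, not part of PostgreSQL. Nulls are always left out
    of the population; "too wide" rows are left out only when
    exclude_toowide is True (the behaviour that was rejected above).
    """
    population = sample_rows - null_rows
    if exclude_toowide:
        population -= toowide_rows
    if population <= 0:
        raise ValueError("no non-null rows left in the sample")
    return count / population


# Made-up sample: 1000 rows, 200 nulls, 300 too-wide rows,
# and one normal-width value seen 50 times.
discounted = value_fraction(50, 1000, 200, 300)
inflated = value_fraction(50, 1000, 200, 300, exclude_toowide=True)
print(discounted, inflated)
```

With these made-up numbers the value covers 50 of 800 non-null rows when the too-wide rows are counted, against 50 of 500 when they are dropped, so dropping them would overstate how common the value is.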

> Haven't looked at 0002 yet.

[crosses fingers] I hope you'll have a chance to do that before the
feature freeze for 9.6…

--
Alex
