Re: [HACKERS] PATCH: multivariate histograms and MCV lists

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date: 2019-01-14 19:21:45
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On 1/14/19 4:31 PM, Tomas Vondra wrote:
> On 1/14/19 12:20 PM, Dean Rasheed wrote:
>> (Removing Adrien from the CC list, because messages to that address
>> keep bouncing)
>> On Sun, 13 Jan 2019 at 00:31, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>> On 1/10/19 6:09 PM, Dean Rasheed wrote:
>>>> In the previous discussion around UpdateStatisticsForTypeChange(), the
>>>> consensus appeared to be that we should just unconditionally drop all
>>>> extended statistics when ALTER TABLE changes the type of an included
>>>> column (just as we do for per-column stats), since such a type change
>>>> can rewrite the data in arbitrary ways, so there's no reason to assume
>>>> that the old stats are still valid. I think it makes sense to extract
>>>> that as a separate patch to be committed ahead of these ones, and I'd
>>>> also argue for back-patching it.
>>> Wasn't the agreement to keep stats that don't include column values
>>> (functional dependencies and ndistinct coefficients), and reset only
>>> more complex stats? That's what happens in master and how it's extended
>>> by the patch for MCV lists and histograms.
>> Ah OK, I misremembered the exact conclusion reached last time. In that
>> case the logic in UpdateStatisticsForTypeChange() looks wrong:
>> /*
>> * If we can leave the statistics as it is, just do minimal cleanup
>> * and we're done.
>> */
>> if (!attribute_referenced && reset_stats)
>> {
>> ReleaseSysCache(oldtup);
>> return;
>> }
>> That should be "|| !reset_stats", or have more parentheses.
> Yeah, it should have been
> if (!(attribute_referenced && reset_stats))
> i.e. there's a parenthesis missing. Thanks for noticing this. I guess a
> regression test for this would be useful.
>> In fact, I think that computing attribute_referenced is unnecessary
>> because the dependency information includes the columns that the
>> stats are for and ATExecAlterColumnType() uses that, so
>> attribute_referenced will always be true.
> Hmmm. I'm pretty sure I came to the conclusion it's in fact necessary,
> but I might be wrong. Will check.

Turns out you were right - the attribute_referenced piece was quite
unnecessary. So I've removed it. I've also extended the regression tests
to verify changing type of another column does not reset the stats.


Tomas Vondra
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-multivariate-MCV-lists-20190114.patch.gz application/gzip 40.5 KB
0002-multivariate-histograms-20190114.patch.gz application/gzip 47.2 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-01-14 20:48:32 Re: "SELECT ... FROM DUAL" is not quite as silly as it appears
Previous Message Andres Freund 2019-01-14 18:47:48 Re: Reducing header interdependencies around heapam.h et al.