Re: Multivariate MCV list vs. statistics target

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Multivariate MCV list vs. statistics target
Date: 2019-07-19 23:12:21
Message-ID: 20190719231221.2udexexu4k7lkbrb@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jul 19, 2019 at 06:12:20AM +0000, Jamison, Kirk wrote:
>On Tuesday, July 9, 2019, Tomas Vondra wrote:
>> >apparently v1 of the ALTER STATISTICS patch was a bit confused about
>> >length of the VacAttrStats array passed to statext_compute_stattarget,
>> >causing segfaults. Attached v2 patch fixes that, and it also makes sure
>> >we print warnings about ignored statistics only once.
>> >
>> v3 of the patch, adding pg_dump support - it works just like when you tweak
>> statistics target for a column, for example. When the value stored in the
>> catalog is not -1, pg_dump emits a separate ALTER STATISTICS command setting
>> it (for the already created statistics object).
>Hi Tomas, I stumbled upon your patch.
>According to the CF bot, your patch applies cleanly, builds successfully, and
>passes make world. Meaning, the pg_dump tap test passed, but there was no
>test for the new SET STATISTICS yet. So you might want to add a regression
>test for that and integrate it in the existing alter_generic file.
>Upon quick read-through, the syntax and docs are correct because it's similar
>to the format of ALTER TABLE/INDEX... SET STATISTICS... :
> ALTER [ COLUMN ] column_name SET STATISTICS integer
>+ /* XXX What if the target is set to 0? Reset the statistic? */
>This also makes me wonder. I haven't looked deeply into the code, but since 0 is
>a valid value, I believe it should reset the stats.

I agree resetting the stats after setting the target to 0 seems quite
reasonable. But that's not what we do for attribute stats, because in
that case we simply skip the attribute during the future ANALYZE runs -
we don't reset the stats, we keep the existing stats. So I've done the
same thing here, and I've removed the XXX comment.

If we want to change that, I'd do it in a separate patch for both the
regular and extended stats.

>After lookup though, this is how it's tested in ALTER TABLE:
>/test/regress/sql/stats_ext.sql:-- Ensure things work sanely with SET STATISTICS 0

Well, yeah. But that tests we skip building the extended statistic
(because we excluded the column from the ANALYZE run).

>> I've considered making it part of CREATE STATISTICS itself, but it seems a
>> bit cumbersome and we don't do it for columns either. I'm not against adding
>> it in the future, but at this point I don't see a need.
>I agree. Perhaps that's for another patch should you decide to add it in the future.


Attached is v4 of the patch, with a couple more improvements:

1) I've renamed the if_not_exists flag to missing_ok, because that's
more consistent with the "IF EXISTS" clause in the grammar (the old flag
was kinda the exact opposite), and I've added a NOTICE about the skip.

2) I've renamed ComputeExtStatsTarget to ComputeExtStatsRows, because
that's what the function was doing anyway (computing sample size).

3) I've added a couple of regression tests to stats_ext.sql

Aside from that, I've cleaned up a couple of places and improved a bunch
of comments. Nothing huge.


Tomas Vondra
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
alter-statistics-v4.patch text/plain 30.5 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-07-19 23:13:37 Re: should there be a hard-limit on the number of transactions pending undo?
Previous Message Peter Geoghegan 2019-07-19 22:47:23 Re: should there be a hard-limit on the number of transactions pending undo?