Re: extended stats on partitioned tables

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: extended stats on partitioned tables
Date: 2021-11-03 22:48:44
Message-ID: 46f37069-55cb-db6a-a15a-e351a2606592@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/8/21 12:45 AM, Justin Pryzby wrote:
> On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:
>> On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
>>> On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:
>>>> It seems like your patch should also check "inh" in examine_variable and
>>>> statext_expressions_load.
>>>
>>> I tried adding that - I mostly kept my patches separate.
>>> Hopefully this is more helpful than a complication.
>>> I added at: https://commitfest.postgresql.org/35/3332/
>>>
>>
>> Actually, this is confusing. Which patch is the one we should be
>> reviewing?
>
> It is confusing, but not as much as I first thought. Please check the commit
> messages.
>
> The first two patches are meant to be applied to master *and* backpatched. The
> first one intends to fixes the bug that non-inherited stats are being used for
> queries of inheritance trees. The 2nd one fixes the regression that stats are
> not collected for inheritence trees of partitioned tables (which is the only
> type of stats they could ever possibly have).
>

I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
the (!rte->inh) check in the rel->statlist loops. AFAICS both places
could do that right at the beginning, because it does not depend on the
statistics object at all, just the RelOptInfo.

> And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
> inherited and non-inherited stats, only in master, since it requires a catalog
> change. It's a bit confusing that patch #4 removes most what I added in
> patches 1 and 2. But that's exactly what's needed to collect and apply both
> inherited and non-inherited stats: the first two patches avoid applying stats
> collected with the wrong inheritence. That's also what's needed for the
> patchset to follow the normal "apply to master and backpatch" process, rather
> than 2 patches which are backpatched but not applied to master, and one which
> is applied to master and not backpatched..
>

Yeah. Af first I was a bit confused because after applying 0003 there
are both the fixes and the "correct" way, but then I realized 0004
removes the unnecessary bits.

The one thing 0003 still needs is to rework the places that need to
touch both inh and !inh stats. The patch simply does

for (inh = 0; inh <= 1; inh++) { ... }

but that feels a bit too hackish. But if we don't know which of the two
stats exist, I'm not sure what to do about it. And I'm not sure we do
the right thing after removing children, for example (that should drop
the inheritance stats, I guess).

The 1:2 mapping between pg_statistic_ext and pg_statistic_ext_data is a
bit strange, but I can't think of a better way.

> @Tomas: I just found commit 427c6b5b9, which is a remarkably similar issue
> affecting column stats 15 years ago.
>

What can I say? The history repeats itself ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-11-03 22:56:41 Re: Extending amcheck to check toast size and compression
Previous Message Justin Pryzby 2021-11-03 22:39:13 Re: should we enable log_checkpoints out of the box?