Re: Extended Statistics set/restore/clear functions.

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Extended Statistics set/restore/clear functions.
Date: 2026-01-25 02:19:30
Message-ID: CADkLM=f4+3_HrXSAaEic5VGGd-fOjS4jGnr1WGJYrL7vf3ZJnw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
> Yes, we are entering into some complexity territory here that needs a
> very careful lookup, and I am not sure yet if your suggestion of
> input arguments is the best thing to do.
>

With that in mind, I'm going to move forward with a local static version of
statatt_get_type() which avoids the typecache stuff altogether. Whatever
remains there will be instructive for how we keep/lose/modify the real
statatt_get_type().

>
> I have been thinking about how to move this patch forward, and I think
> that we should split things a bit more the logic (aka my usual
> divide-and-conquer strategy) for the restore function, because we
> should make sure that each part is sound before merging, and that's
> also a risk reduction if we need to revert one part or the other:
> 1) One patch for ndistinct.
> 2) One patch for dependencies.
> 3) One patch for MVC.
>

All three of these are possible, though the tests of trying to set one of
the three types on a stat object incapable of holding it won't be add-able
until the next type of stats are settlable.

Expressions would be a 4th patch, and honestly it's the lion's share of the
code.

> 4) Dump/restore.
>
> The "exprs" input argument is required by all the three others, meaning
> that it needs to stand in the first bits of the patch. The order of
> ndistinct and dependencies does not matter, let's just do these based
> on the order of the input arguments. Dump/restore could be already
> integrated even if we have only 1) and 2) restore integrated in the
> tree. It also seems like the "exprs" argument should stand on top of
> all the others, as well, so it would make more sense to have it after
> "inherited" in extended_stats_argnum?
>

I think I went with the order they appeared in pg_stats_ext, with exprs
tacked on the end. I'm a bit confused why the argument order matters given
that the only user exposure is the kwargs-like interface.

>
> There are two reasons behind this set of thoughts:
> 1) The regression tests are bloated. We rely currently on one single
> extended stats object that holds all the types of stats (MCV,
> dependencies, ndistinct), meaning that for some of the valid cases we
> need to cary all the types in the input of restore function. It's
> also a bit true in terms of the invalid cases. One thing that I would
> recommend to do here is create two simpler extstat objects for each
> stxkind (one with expression, one without expression). You should
> still require the "global" extstat object that includes all the
> stxkind, which counts for the expression import path, of course.
>

We can definitely lean on the one-stat-only objects more now that we have
them.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2026-01-25 02:35:50 Re: Extended Statistics set/restore/clear functions.
Previous Message Michael Paquier 2026-01-25 01:26:33 Re: Add WALRCV_CONNECTING state to walreceiver