Re: Add expressions to pg_restore_extended_stats()

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: Add expressions to pg_restore_extended_stats()
Date: 2026-02-03 02:17:03
Message-ID: CADkLM=dWG-1oYFPYAXpL9UZ-C2VOsQxrtwsgmvL_vuLzAkOedA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> >> For this warning in the regression tests, you should only need one
> >> element, reducing the number of input lines?
> >
> > That's true IF we decide that missing expression keys are a thing that we
> > allow, and per conversation above it's not clear yet that we do.
>
> If ANALYZE can generate some partial data for a set of expression (aka
> generating some data for a portion of the expressions defined but skip
> some of them because of reason x_y_z), the restore function needs to
> cope with this policy (I'd need to double-check the code to be sure,
> not sure on top of my mind now and I cannot do that today,
> unfortunately).
>

In this latest patch, I decided that we can tolerate missing exprs
parameters, and as you'll see from the 0002 patch, it does result in
considerable decluttering.

Hmm. I would not rely on the ordering of the items as they are
> scanned, that seems like a recipe for disaster. We have a text
> representation of the expression, as of pg_stats_ext_exprs.expr. This
> could be mapped with the elements of hte text[] in pg_stats_ext.exprs.
> Here is a wilder idea: why not just putting the expression text itself
> in the data given in input of the restore function rather than a
> guessed argument number? For the case of manual stats injections,
> that kinds of makes things simpler.
>

The patch I submitted here hasn't yet implemented using the expression text
as a key, and honestly I'm not too hyped on the prospect of trying.

I think that:

1. I don't think we can guarantee that the expression node text is stable
across major versions, and that would break upgrades, the primary function
of this code.
2. Anyone wanting to modify/hack the exprs values has almost certainly
extracted it using the jsonb_build_object() code in pg_dump, so they
already have all expressions before editing.
3. Array unnest() has proven to give a stable order in all tests so far.
4. We don't decompose mcv into it's parts, so why do that for exprs?

Yeah, it sounds to me that we should just set ok=false and give up
> rather than have a semi-filled set of numbers for a single extended
> stats object. There is an argument in favor of that: it can simplify
> the detection of missing stats for a single extended stats definition.
> I understand that you'd want to keep going with loading the data even
> if it's partial. My question is: is it possible for ANALYZE to fill
> in only a portion of the expressions and can these be partially
> skipped? If the answer to my question is yes, the restore function
> should do the same and my idea of the matter is wrong. If the answer
> to my question is no, then your idea on this matter is the right one.
>

In which case I think you'll like the latest patchset.

>
> > 4. Does casting the numeric scalar values (null_frac, correlation,
> > avg_width, n_distinct) to text make sense, since we have to put them
> > through type-specific input functions anyway?
>
> In terms of JSON, it makes the use of a representation simpler. I
> don't think that we need to apply a strict casting when fetching them.
>

Everything is jbvStrings now.

> @Tomas (now added in CC for confirmation): would you see a problem
> against applying a JSONB data type to the argument for the restore of
> extended stats? This level of data serialization would be required
> when inserting data for what would show up into pg_stats_ext_exprs.
>

Very interested to hear his thoughts as well.

Attachment Content-Type Size
v3-0001-Add-support-for-exprs-in-pg_restore_extended_stat.patch text/x-patch 73.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2026-02-03 02:25:41 Re: pg_resetwal: Fix wrong directory in log output
Previous Message Fujii Masao 2026-02-03 02:16:00 Re: Wake up backends immediately when sync standbys decrease