| From: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
|---|---|
| To: | Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: use of SPI by postgresImportForeignStatistics |
| Date: | 2026-06-30 15:55:03 |
| Message-ID: | CADkLM=enKc-3-uTjUSXXhoRhSKDHVdb_ate+S2dtkEcscW5bvA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Jun 30, 2026 at 7:30 AM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
wrote:
> On Tue, Jun 30, 2026 at 10:10 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
> wrote:
> > On Mon, Jun 29, 2026 at 2:17 PM Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> >> On Mon, Jun 29, 2026 at 7:50 AM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
> wrote:
> >> > I like this refactoring, but while it's rather mechanical, it's pretty
> >> > large, so I think it's too late to do the refactoring at this time
> >> > just before the beta 2 release. So I'd vote for going with your
> >> > v1-0001 and v1-0002 and doing the refactoring in v20. As mentioned by
> >> > Robert, I don't think it's good to call LOCAL_FCINFO() in
> >> > import_relation_statistics() and import_attribute_statistics() to call
> >> > the guts of those functions either, but that is *consistent* with the
> >> > existing way pg_restore_relation_stats() and
> >> > pg_restore_attribute_stats() do that, so that is actually not that
> >> > bad. Also, as you mentioned above, it's inefficient for the new API
> >> > functions to lock an already-locked relation, and validate an
> >> > already-validated attname/attnum, but I think it would be negligible.
> >>
> >> Fujita-san, is it then your plan to get those two patches committed?
>
> Yes, it is.
>
> > Are we ok with changing the functions that postgres_fdw makes from v19
> to v20? Because what I put in the v2 patch does not match the v1 patch. If
> we are NOT ok with that, then we'd need to:
> >
> > 1. rename the two relation_stats_argnum -> relation_args_argnum and
> attribute_stats_argnum -> attribute_args_argnum out of the way of the ones
> created in relation_stats.h and attribute_stats.h. The actual enumeration
> values can stay the same, as there are no conflicts there, just ambiguity
> about which set of values they belong to.
> > 2. rename the existing static functions relation_statistics_update ->
> update_relstats and attribute_statistics_update -> update_relstats to make
> way for the public functions of the same names.
> > 3. Create new relation_statistics_update and
> attribute_statistics_update, with the isnull/values, and have those
> fcinfo-invoke the respective pg_restore functions just like v1 does.
> >
> > That would at least make the user API consistent in v19 vs v20.
> >
> > I'm on vacation this week, but time is short for beta2, so I'll make an
> exception to get the above into beta2. Let me know if you want me to write
> that up.
>
> Thanks!
>
> Here is an updated patch, which merges your v1-0001 and v1-0002. One
> notable change I made to your version is the signatures of
> import_relation_statistics() and import_attribute_statistics(). As
> mentioned upthread, the proposed signatures of these functions would
> be too tailored for postgres_fdw, and might not be useful for other
> FDWs, so I changed them to have NullableDatum arguments for stats-data
> inputs like this:
>
> bool
> import_relation_statistics(Relation rel,
> NullableDatum *relpages,
> NullableDatum *reltuples,
> NullableDatum *relallvisible,
> NullableDatum *relallfrozen)
>
> bool
> import_attribute_statistics(Relation rel, AttrNumber attnum, bool
> inherited,
> NullableDatum *null_frac,
> NullableDatum *avg_width,
> NullableDatum *n_distinct,
> NullableDatum *most_common_vals,
> NullableDatum *most_common_freqs,
> NullableDatum *histogram_bounds,
> NullableDatum *correlation,
> NullableDatum *most_common_elems,
> NullableDatum *most_common_elem_freqs,
> NullableDatum *elem_count_histogram,
> NullableDatum *range_length_histogram,
> NullableDatum *range_empty_frac,
> NullableDatum *range_bounds_histogram)
>
> As postgresImportForeignStatistics() is provided for FDWs that want to
> calculate statistics remotely, so I'm assuming that they have enough
> knowledge about the stats-data structures.
> I think we could instead
> use the values/isnull arrays as proposed in your v2, but the problem
> about doing so is error messages in functions in stat_utils.c like
> this:
>
> ereport(WARNING,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("argument \"%s\" must not be a multidimensional
> array",
> arginfo[argnum].argname)));
>
> Note that the message uses the word "argument", so I think it's better
> for the new functions as well to have individual arguments for
> stats-data inputs.
That's a good example of how our error messages get a bit clumsy with so
many ways of calling things, so I'm fine with your solution. It also
eliminates the possibility that a caller fails to notice a newly added stat
parameter and instead calls the input function with the older short array
length.
My initial concern was that this would bloat up postgres_fdw with type
conversion code, but the patch shows that it isn't that bad.
One thing we do lose in this is the assurance that the Datums passed are of
the correct type. Presently that is checked via
stats_fill_fcinfo_from_arg_pairs, which allows the *_statistics_update()
functions to trust the type correctness of the Datums that they're passed,
but I can be convinced that it isn't a big problem.
> Also, I removed the attname argument from
> import_attribute_statistics(), to 1) make the signature (and the
> internal processing) simple and 2) match it to
> delete_attribute_statistics().
>
I think that's reasonable. Internal callers have already resolved which
attribute they want to modify. The attname parameter was there so that
pg_restore_* calls could survive dump/restores where there were dropped
columns so the attnum has changed.
In relation to this change, I moved helper functions set_XXX_arg()
> that you added to relation_stats.c and attribute_stats.c to
> postgres_fdw.c. The helper functions had soft error handling, but in
> the postgres_fdw use, there is no need for that, so I removed that
> handling as well.
>
My inclination would be to move the set_*_arg functions could be moved to
some common utility file in v20, as other FDWs will find themselves with
the same need.
As for removing the error-handling, it might still be handy because it
would allow us to give a more context-aware error message, rather than the
very narrow "this_string is an invalid this_type", for string that the user
most certainly never saw. Having said that, it's something we could easily
change back to error-safe if we wanted to.
What do you think? If there is no problem with this change, we
> wouldn't need to worry about the stability of postgres_fdw (and other
> FDWs) in v20.
>
If you're fine with the names import_relation_statistics and
import_attribute_statistics, then yes. Those names made sense to me in the
narrow case of being called from an FDW handler and handing the function a
bunch of strings, but in a more general case with properly typed datums,
the name feels less appropriate, but it's just a name and I wouldn't let it
get in the way of the overall patch.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nathan Bossart | 2026-06-30 15:58:18 | Re: remove pg_spin_delay() from atomics code |
| Previous Message | Nathan Bossart | 2026-06-30 15:30:25 | Re: Handle concurrent drop when doing whole database vacuum |