| From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
| Cc: | Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com |
| Subject: | Re: Import Statistics in postgres_fdw before resorting to sampling. |
| Date: | 2026-01-22 10:15:46 |
| Message-ID: | CAExHW5vmTR1DQKa6yCOt1bU4KY+AkMmWrAa_kd6RAPg5Hpaw=g@mail.gmail.com |
| Lists: | pgsql-hackers |
On Thu, Jan 22, 2026 at 2:21 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>>
>> Changes in this release, aside from rebasing:
>>
>> - The generic analyze and fdw.h changes are in their own patch (0001) that ignores contrib/postgres_fdw entirely.
>> - The option for remote_analyze has been moved to its own patch (0003).
>> - The errors raised are now warnings, to ensure that we can always fall back to row sampling.
>> - All local attributes with attstattarget > 0 must get matching remote statistics or the import is considered a failure.
>> - The pg_restore_attribute_stats() call has been turned into a prepared statement, for clarity and some minor parsing savings.
>> - The calls to pg_restore_relation_stats() are parameterized, but not prepared as this is rarely called more than once.
>> - postgresStatisticsAreImportable will now disqualify a table if it has extended statistics objects, because we can't compute those without a row sample.
>
Thanks, Corey, for breaking up these patches. It makes reviewing easier.

analyze_rel() and acquire_inherited_sample_rows() both call
fdwroutine->AnalyzeForeignTable(), but only the first one uses the
statistics import facility. Is that intentional? A typical sharding
setup creates a partitioned table with foreign tables as partitions.
When the parent is analyzed, those partitions are handled by the
second function. Thus a big use case of postgres_fdw won't be able to
use the statistics import facility. That seems like a major drawback
of this patch.

Thinking more about it, acquire_inherited_sample_rows() accumulates
the sample rows from the child tables; statistics are then extracted
from those rows and the corresponding pg_statistic rows are updated.
Doing that through statistics import seems a bit tricky, since we
would need to combine statistics from multiple relations. Can we do
that? There would be an advantage if we could combine stats across
multiple relations: we wouldn't have to sample the children twice when
analyzing the parent without ONLY. Instead we could produce the parent
statistics by combining statistics across the children and the parent.
To me this looks like a different beast altogether, much like partial
aggregates.
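
To make the "combine statistics across relations" idea concrete, here
is a minimal sketch in C for the simplest per-column values only. It is
an illustration under stated assumptions, not code from the patch or
from PostgreSQL: the ChildColStats struct and combine_col_stats() are
hypothetical names, and it assumes each child reports a row count, a
null fraction, and an average width of non-NULL entries. Values such as
n_distinct, most-common-value lists, and histograms cannot be merged by
a weighted average like this, which is where the real difficulty lies.

```c
/* Hypothetical per-child summary; not a PostgreSQL struct. */
typedef struct ChildColStats
{
    double reltuples;   /* estimated row count of the relation */
    double null_frac;   /* fraction of NULLs in the column */
    double avg_width;   /* average width in bytes of non-NULL entries */
} ChildColStats;

/*
 * Combine per-child column statistics into one parent-level summary.
 * null_frac is weighted by each child's row count; avg_width is
 * weighted by each child's non-NULL row count.
 */
static ChildColStats
combine_col_stats(const ChildColStats *children, int nchildren)
{
    ChildColStats result = {0.0, 0.0, 0.0};
    double total_nulls = 0.0;
    double total_nonnull = 0.0;
    double total_width = 0.0;

    for (int i = 0; i < nchildren; i++)
    {
        double nulls = children[i].reltuples * children[i].null_frac;
        double nonnull = children[i].reltuples - nulls;

        result.reltuples += children[i].reltuples;
        total_nulls += nulls;
        total_nonnull += nonnull;
        total_width += nonnull * children[i].avg_width;
    }

    /* Guard against empty input or all-NULL columns. */
    if (result.reltuples > 0)
        result.null_frac = total_nulls / result.reltuples;
    if (total_nonnull > 0)
        result.avg_width = total_width / total_nonnull;

    return result;
}
```

For example, two children with (100 rows, 25% NULLs) and (300 rows,
50% NULLs) combine to 400 rows with a null fraction of 175/400.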

It would be good to fix this drawback. If not, we should at least
figure out (plan/POC) how to deal with the child tables. At a minimum
we need to document this drawback: the documentation in the current
patch reads as if all foreign tables will use this facility when
available.
--
Best Wishes,
Ashutosh Bapat