Re: Import Statistics in postgres_fdw before resorting to sampling.

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com
Subject: Re: Import Statistics in postgres_fdw before resorting to sampling.
Date: 2026-01-27 05:47:08
Message-ID: CAExHW5vsQK1JVn15DYdsGAQ-qbmce_MyP7F67Ne-JpaaLRYUOg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 23, 2026 at 10:45 PM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>>
>> >> There's an advantage if we can combine stats across multiple relations
>> >> - we don't have to sample children twice when analyzing the parent
>> >> without ONLY. Instead we could produce parent statistics by combining
>> >> statistics across children and the parent. To me this looks like
>> >> altogether a different beast just like partial aggregates.
>> >
>> >
>> > I think this patch is only ever going to get us out of 1 of the 2 samples, which isn't ideal but it is a savings.
>> >
>>
>> I am not suggesting to synthesize sample rows. Calculate the
>> statistics of the parent table from that of its children.
>
>
> I'm not sure we can actually do that. The functions that compute the statistics are all based off of row samples, not already computed statistics. I don't think we can synthesize a row sample from the imported statistics, at least not accurately. If I'm misunderstanding what you're suggesting, please correct me.

I am comparing the calculation of statistics to the calculation of
aggregates. We have code to compute aggregates on a partitioned table
from the partial aggregates computed from the individual partitions.
(Although I mention partitioned tables, the technique would apply to
an inheritance hierarchy as well.) Similarly, if we could come up
with a representation of partial statistics, we could have partial
statistics computed for the children (and for the parent itself in
non-partitioned inheritance), and then combine them to compute the
statistics for the parent without the need to synthesize row samples
from the children. I haven't looked at all the kinds of statistics to
see whether this is feasible.
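
To make the analogy with partial aggregates concrete, here is a
minimal sketch for the simplest per-column statistics only (row count,
null fraction, average width). It is not from any patch: PartialStats,
make_partial, combine and finalize are hypothetical names, and it
assumes avg_width is an average over non-null values. Other kinds,
such as n_distinct, MCV lists and histograms, cannot be combined this
directly, so whether a usable partial representation exists for them
remains open.

```python
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Hypothetical partial state for one column of one relation."""
    reltuples: float    # estimated number of rows
    null_count: float   # estimated number of NULL values
    total_width: float  # summed width (bytes) of non-null values


def make_partial(reltuples: float, null_frac: float, avg_width: float) -> PartialStats:
    # Turn already-computed ratios back into additive quantities.
    # Assumption: avg_width is averaged over non-null values only.
    null_count = reltuples * null_frac
    return PartialStats(reltuples, null_count, (reltuples - null_count) * avg_width)


def combine(a: PartialStats, b: PartialStats) -> PartialStats:
    # Like a partial aggregate's combine step: the state is purely additive.
    return PartialStats(
        a.reltuples + b.reltuples,
        a.null_count + b.null_count,
        a.total_width + b.total_width,
    )


def finalize(p: PartialStats) -> tuple[float, float]:
    # Produce (null_frac, avg_width) for the parent from the combined state.
    non_null = p.reltuples - p.null_count
    null_frac = p.null_count / p.reltuples if p.reltuples > 0 else 0.0
    avg_width = p.total_width / non_null if non_null > 0 else 0.0
    return null_frac, avg_width


# Two children of a hypothetical parent, described only by their statistics.
child1 = make_partial(reltuples=1000, null_frac=0.25, avg_width=8)
child2 = make_partial(reltuples=3000, null_frac=0.5, avg_width=4)
parent = combine(child1, child2)
parent_null_frac, parent_avg_width = finalize(parent)
```

The point of the sketch is only that the parent's values come from the
children's statistics, with no row samples involved.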

--
Best Wishes,
Ashutosh Bapat
