Re: Import Statistics in postgres_fdw before resorting to sampling.

From:	Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To:	Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc:	Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com
Subject:	Re: Import Statistics in postgres_fdw before resorting to sampling.
Date:	2026-02-12 14:29:34
Message-ID:	CADkLM=cU1YW4yeW-osNGLkhWQp+p6bt0MYUizYE-Vw87pG-igg@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Jan 29, 2026 at 2:20 PM Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
wrote:

>
>> The way this is implemented, it will favour the use cases where foreign
>> tables are not child tables.
>
>
> It is true that this feature does not benefit the recursive
> do_analyze_rel() case. But it does help when those same tables are analyzed
> directly.
>
>
>> That leaves out the sharding use case
>> which I believe is also a significant use case. I think we need to
>> think, how can we make that use case benefit from this optimization.
>
>
> I agree that we should find a way to do that, but this handles the other
> case, and doesn't prevent us from later teaching
> postgresAnalyzeForeignTable() to cache the rowsample locally for later
> use, at which point postgresImportStatistics() could weigh the relative
> benefits of using that locally cached sample vs the already formed remote
> statistics. Even in that case, I'm guessing that the remote table's stats
> will be based on a larger and therefore better sample than the one
> we are able to pull across the wire and cache locally, so the remotely
> computed statistics would be better.
>
>> Not being able to use statistics available on the remote side seems a
>> major limitation. But I don't have a better solution than to think of
>> supporting some kind of partial statistics.
>
>
> I'm not against trying to fetch and cache rowsamples, or cache some
> partially aggregated results of a rowsample, but this patch does not cover
> that. This patch should, at least in theory, reduce the number of table
> samples pulled across the wire by 50% and that seems worthwhile.
>
>

Rebase with some error message cleanups.

Attachment	Content-Type	Size
v13-0001-Add-FDW-functions-for-importing-optimizer-statis.patch	text/x-patch	5.0 KB
v13-0002-Add-remote-statistics-fetching-to-postgres_fdw.patch	text/x-patch	36.9 KB
v13-0003-Add-remote_analyze-to-postgres_fdw-remote-statis.patch	text/x-patch	10.8 KB

In response to

Re: Import Statistics in postgres_fdw before resorting to sampling. at 2026-01-29 19:20:26 from Corey Huinker
