Re: Import Statistics in postgres_fdw before resorting to sampling.

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com
Subject: Re: Import Statistics in postgres_fdw before resorting to sampling.
Date: 2026-04-03 13:55:54
Message-ID: CAPmGK14U+wiuuqJ8Qgd8P-aZ52XO=o2DyF-sRhVp8qaJ5HduAg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 3, 2026 at 2:14 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>>
>> The problem is not limited to this special case. Consider cases when
>> 1) the remote table that has many rows are heavily updated after it
>> got analyzed, and then 2) postgres_fdw imports its stats before it
>> gets re-analyzed. The stats postgres_fdw imports would be stale,
>> causing plan degradation. I don't think we should enable this feature
>> by default until we guarantee stats freshness in some way.
>
> So it seems like we have the following configurations desired by at least somebody:
>
> 0. Row Sampling Only
> 1. Fetch stats and fall back to row sampling.
> 2. Always analyze remote table (assuming it is a table that can hold stats), then fetch stats, and fall back if necessary.
> 3. Fetch stats, and if that turned up 0 attribute stats try an analyze, then try to refetch and if it still fails go to row sampling.
>
> With the following interpretation of reltuples = 0:
>
> a. The table is definitively empty, stop.
> b. The table is missing stats and running an analyze is cheap (assuming remote analysis is even enabled)
> c. if remote version >= 14 then a else b
>
> I'm of the opinion that 3c is the best configuration for most tables, and you have advocated for 1a without an analyze option and 2a with one. Option 2 seems a bit heavy handed to me, but I could see checking the remote pg_stat_all_tables and making an analyze/no-analyze judgement call based on that, perhaps call that analyze_stale_vacuum_interval or something like that. That could be a neat feature for v20, and so whatever default we choose for fetch_stats, I ask that we choose values that keep our options open for all 4x3 configurations enumerated above.

Before discussing this, let's wrap up the work for v19. Here is a new
version of the patch. Changes are:

* As above, I think we should disable this feature by default for now,
so I modified the patch as such.

* You are defining a remote query per version at the top of the file
that is used in fetch_attstats, but I think that that reduces
readability, making maintenance hard, so I modified that function to
build the query dynamically in it, as I proposed before. I also
modified that function to use PQsendQuery, not PQsendQueryParams, as
in fetch_relstats, for efficiency.

* I moved the code for opening/closing the connection from
postgresImportStatistics to fetch_remote_statistics, as the connection
is only used in the latter function. I also removed useless cleanup
processing in that function.

* I added regression tests.

* I fixed a minor bug I noticed while adding those tests.

* I did some cleanup and editorialization including re-ordering
functions in logical order.

One thing I'm wondering is: we should rename ImportStatistics to
ImportForeignStatistics, for consistency with ImportForeignSchema?
Also, we should rename fetch_stats to import_stats, for consistency
with eg, import_default?

Anyway I think the patch is now in good shape, except comments/docs,
which I will work on next.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
v19-Add-remote-statistics-fetching-to-postgres_fdw.patch application/octet-stream 53.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Imran Zaheer 2026-04-03 13:56:01 Remove commented-out code in 026_overwrite_contrecord.pl
Previous Message Nathan Bossart 2026-04-03 13:54:36 Re: vectorized CRC on ARM64