From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com
Subject: Re: Import Statistics in postgres_fdw before resorting to sampling.
Date: 2026-01-15 13:47:34
Message-ID: CAExHW5uEicRpM0cxEgX_0PL41V6OcHUa0v3BOdBzsQOQcBuh=Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 9, 2026 at 9:52 PM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>
> On Fri, Jan 9, 2026 at 4:35 AM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>>
>> On Wed, Jan 7, 2026 at 11:34 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>>
>> >
>> > Anyway, here's v8, incorporating the documentation feedback and Matheus's notes.
>>
>> I went through the patches. I have one question: the column names of
>> the foreign table on the local server are sorted using qsort(), which
>> uses C sorting. The column names in the result obtained from the
>> foreign server are sorted using an ORDER BY clause. If the default
>> collation on the foreign server is different from the one used by
>> qsort(), the merge sort in import_fetched_statistics() may fail.
>> Shouldn't these two sorts use the same collation?
>
>
> I wondered about that, but somehow thought that because they're of type pg_catalog.name rather than text/varchar that they weren't subject to collation settings. We definitely should explicitly sort by C collation regardless.
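To illustrate the concern (this is not code from the patch): a byte-wise "C"-collation sort and a locale-aware sort can disagree on mixed-case identifiers, so a merge over the two orderings can miss matching columns. A hypothetical Python sketch, using a case-folding sort as a stand-in for a locale-aware remote collation:

```python
# Illustration only: why a local qsort() (byte-wise, i.e. "C" collation) and a
# remote ORDER BY under a different collation can produce incompatible orders.
names = ["id", "Name", "amount"]

local_order = sorted(names)                     # byte-wise: 'N' (0x4E) < 'a' (0x61)
remote_order = sorted(names, key=str.casefold)  # stand-in for a locale-aware sort

print(local_order)   # ['Name', 'amount', 'id']
print(remote_order)  # ['amount', 'id', 'Name']

# A merge join over these two streams assumes both sides share one ordering;
# when they disagree, matching pairs can be silently skipped.
assert local_order != remote_order
```

Forcing the remote query to ORDER BY with the "C" collation would make both sides agree regardless of the foreign server's default collation.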
>
>>
>>
>> Some minor comments
>>
>> /* avoid including explain_state.h here */
>> typedef struct ExplainState ExplainState;
>> -
>>
>> unintended line deletion?
>
>
> Yeah, must have been.
>
>
>>
>>
>> fetch_remote_statistics() fetches the statistics twice if the first
>> attempt fails. I think we need the same check after the second attempt
>> as well. The second attempt should not fail, but I think we need at
>> least some safety checks and an Assert, in case the foreign server
>> misbehaves.
>
>
> I think the only test *not* done on re-fetch is checking the relkind of the relation; are you speaking of another check? It's really no big deal to do that one again, but I made the distinction early on in the coding. In retrospect there are relatively few checks we'd consider skipping on the second pass, so maybe the distinction isn't worth keeping.
>
> Since you're joining the thread, we have an outstanding debate about what the desired basic workflow should be, and I think we should get some consensus before we paint ourselves into a corner.
>
> 1. The Simplest Possible Model
>
> * There is no remote_analyze functionality
> * fetch_stats defaults to false
> * Failure to fetch stats results in a failure, no failover to sampling.
>

Instead of fetch_stats, I would prefer has_stats. But if the majority
is for fetch_stats, I am OK with it.

> 2. Simplest Model, but with Failover
>
> * Same as #1, but if we aren't satisfied with the stats we get from the remote, we issue a WARNING, then fall back to sampling, trusting that the user will eventually turn off fetch_stats on tables where it isn't working.
>
> 3. Analyze and Retry
>
> * Same as #2, but we add remote_analyze option (default false).
> * If the first attempt fails AND remote_analyze is set on, then we send the remote analyze, then retry. Only if that fails do we fall back to sampling.
>
> 4. Analyze and Retry, Optimistic
>
> * Same as #3, but fetch_stats defaults to ON, because the worst case scenario is that we issue a few queries that return 0-1 rows before giving up and just sampling.
> * This is the option that Nathan advocated for in our initial conversation about the topic, and I found it quite persuasive at the time, but he's been slammed with other stuff and hasn't been able to add to this thread.
>
> 5. Fetch With Retry Or Sample, Optimistic
>
> * If fetch_stats is on, AND the remote table is seemingly capable of holding stats, attempt to fetch them, possibly retrying after ANALYZE depending on remote_analyze.
> * If fetching stats failed, just error, as a way to prime the user into changing the table's setting.
> * This is what's currently implemented, and it's not quite what anyone wants. Defaulting fetch_stats to true doesn't seem great, but not defaulting it to true will reduce adoption of this feature.
>
> 6. Fetch With Retry Or Sample, Pessimistic
>
> * Same as #5, but with fetch_stats = false.

I have no strong opinion about whether fetch_stats should default to
on or off. In my simple view, the remote relation that a foreign
table points to is usually a regular table or a partitioned table, so
defaulting to ON makes sense to me. But it's possible that in an OLAP
system most of the foreign objects are views exposing only some of
the data. Defaulting to off is backward compatible and requires a
cautious opt-in; otherwise every ANALYZE of a foreign table will
suddenly become more expensive, since it runs a couple of extra
queries before falling back to sampling.

Why do we need remote_analyze as a flag? I might have missed the
conversation that led to including it. We fetch relkind, so we know
whether the object on the foreign server supports statistics or not.
If it does, we could run ANALYZE automatically: the user wants us to
ANALYZE the foreign table and has also indicated that the remote
object is capable of storing stats. We should document the fact that
we analyze remotely if we do not find existing statistics. Why would
users want to disable running ANALYZE? One possibility is that they
don't want query plans disturbed by suddenly changing statistics, but
then they would also need to switch auto-analyze off on the remote,
so maybe there is already a way to know when it's not acceptable to
run ANALYZE on the remote server. I feel this flag will put another
burden on the user, who already has to set so many flags in
postgres_fdw. That is especially true given that the flag is useful
only the first time, when the remote object doesn't yet have stats;
usually there will be stats. If we have to introduce remote_analyze
at all, let it default to the value of fetch_stats.

If both importing statistics and the fallback ANALYZE fail, I think
we should resort to sampling, since the user has asked to ANALYZE the
table and sampling will achieve that. As you said, a warning at every
failure will nudge the user to set their options correctly.
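For clarity, the overall flow I'm suggesting (fetch, then an automatic remote ANALYZE and refetch, then sampling, warning at each step) could be sketched like this; every name here is a stand-in for illustration, not an actual postgres_fdw function or option:

```python
# Hypothetical sketch of the ANALYZE flow being discussed; all callables are
# stand-ins, not postgres_fdw code.
def acquire_stats(fetch_stats, fetch, remote_analyze, refetch, sample, warn):
    """Try fetched stats first, then a remote ANALYZE + refetch, then sampling."""
    if fetch_stats:
        stats = fetch()
        if stats is not None:
            return stats
        warn("no usable remote statistics; running ANALYZE on the remote and retrying")
        remote_analyze()
        stats = refetch()
        if stats is not None:
            return stats
        warn("remote statistics still unusable; falling back to sampling")
    return sample()

# Worst case: both attempts fail, so we warn twice and sample anyway.
warnings = []
result = acquire_stats(
    fetch_stats=True,
    fetch=lambda: None,
    remote_analyze=lambda: None,
    refetch=lambda: None,
    sample=lambda: "sampled stats",
    warn=warnings.append,
)
print(result, len(warnings))  # sampled stats 2
```

The point of the sketch is simply that ANALYZE always ends with usable statistics, and the warnings tell the user which options to adjust.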

Does that make sense?

--
Best Wishes,
Ashutosh Bapat
