Re: Import Statistics in postgres_fdw before resorting to sampling.

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com
Subject: Re: Import Statistics in postgres_fdw before resorting to sampling.
Date: 2026-01-14 19:41:34
Message-ID: CADkLM=fsef+NHPjCR4FXF=9wu6Bsf=0E7MOQKOs4AfHJYuF31w@mail.gmail.com
Lists: pgsql-hackers

>
> Since you're joining the thread, we have an outstanding debate about what
> the desired basic workflow should be, and I think we should get some
> consensus before we paint ourselves into a corner.
>
> 1. The Simplest Possible Model
>
> * There is no remote_analyze functionality
> * fetch_stats defaults to false
> * Failure to fetch stats raises an error; there is no fallback to sampling.
>
> 2. Simplest Model, but with Failover
>
> * Same as #1, but if we aren't satisfied with the stats we get from the
> remote, we issue a WARNING, then fall back to sampling, trusting that the
> user will eventually turn off fetch_stats on tables where it isn't working.
>
> 3. Analyze and Retry
>
> * Same as #2, but we add remote_analyze option (default false).
> * If the first attempt fails AND remote_analyze is set on, then we send
> the remote analyze, then retry. Only if that fails do we fall back to
> sampling.
>
> 4. Analyze and Retry, Optimistic
>
> * Same as #3, but fetch_stats defaults to ON, because the worst case
> scenario is that we issue a few queries that return 0-1 rows before giving
> up and just sampling.
> * This is the option that Nathan advocated for in our initial conversation
> about the topic, and I found it quite persuasive at the time, but he's been
> slammed with other stuff and hasn't been able to add to this thread.
>
> 5. Fetch With Retry Or Sample, Optimistic
>
> * If fetch_stats is on, AND the remote table is seemingly capable of
> holding stats, attempt to fetch them, possibly retrying after ANALYZE
> depending on remote_analyze.
> * If fetching stats fails, just raise an error, as a way to prompt the user
> to change the table's setting.
> * This is what's currently implemented, and it's not quite what anyone
> wants. Defaulting fetch_stats to true doesn't seem great, but not
> defaulting it to true will reduce adoption of this feature.
>
> 6. Fetch With Retry Or Sample, Pessimistic
>
> * Same as #5, but with fetch_stats = false.
>
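
To make the differences between the options easier to compare, here is a rough sketch of the decision flow in Python. It is not the patch's code (the patch is C inside postgres_fdw), only a model of the control flow described above. fetch_stats and remote_analyze are the option names from this thread; fallback_to_sampling, StatsFetchError, and the two callbacks are hypothetical names invented for the illustration, and the "is the remote table capable of holding stats" check from #5 is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TableOptions:
    fetch_stats: bool = False           # option from the thread; its default is the open question
    remote_analyze: bool = False        # option from the thread
    fallback_to_sampling: bool = True   # hypothetical switch: True models #2-#4, False models #1/#5/#6


class StatsFetchError(Exception):
    """Hypothetical error: remote stats were unusable and sampling is not allowed."""


def analyze_foreign_table(opts: TableOptions,
                          fetch_remote_stats: Callable[[], bool],
                          run_remote_analyze: Callable[[], None]) -> List[str]:
    """Return the ordered steps taken for one ANALYZE of a foreign table."""
    steps: List[str] = []

    # fetch_stats off: behave as today and go straight to sampling.
    if not opts.fetch_stats:
        steps.append("sample")
        return steps

    # First attempt to import the remote statistics.
    steps.append("fetch")
    if fetch_remote_stats():
        steps.append("import")
        return steps

    # Optional second attempt after asking the remote side to ANALYZE.
    if opts.remote_analyze:
        steps.append("remote_analyze")
        run_remote_analyze()
        steps.append("fetch")
        if fetch_remote_stats():
            steps.append("import")
            return steps

    # No usable stats: either error out or warn and sample.
    if not opts.fallback_to_sampling:
        raise StatsFetchError("could not import remote statistics")
    steps.append("warn")
    steps.append("sample")
    return steps


if __name__ == "__main__":
    # Stand-in for a remote table that has never been analyzed.
    remote = {"has_stats": False}

    def fetch() -> bool:
        return remote["has_stats"]

    def analyze() -> None:
        remote["has_stats"] = True

    opts = TableOptions(fetch_stats=True, remote_analyze=True)
    print(analyze_foreign_table(opts, fetch, analyze))
```

In this model, #3 and #4 differ only in the default of fetch_stats, and #5 and #6 are the same flow with fallback_to_sampling turned off.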

Rebased after adding the COLLATE argument to the ORDER BY clauses.

Attachment Content-Type Size
v9-0001-Add-remote-statistics-fetching-to-postgres_fdw.patch text/x-patch 42.0 KB
