Re: Import Statistics in postgres_fdw before resorting to sampling.

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, jkatz(at)postgresql(dot)org, nathandbossart(at)gmail(dot)com
Subject: Re: Import Statistics in postgres_fdw before resorting to sampling.
Date: 2026-04-01 11:39:19
Message-ID: CAPmGK17KWkMTOMnB_qTy+8aJ9zz4rUPHLBAUyP4cHeUOxzM5sQ@mail.gmail.com

On Tue, Mar 31, 2026 at 5:04 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:

>> postgres_fdw side:
>>
>> * In fetch_remote_statistics, if we get reltuples=0 for v14 or later,
>> I think we should update only the relation stats with that info, and
>> avoid resorting to analyzing, for efficiency, as I proposed before. I
>> modified that function (and import_fetched_statistics) that way.
>
> This will miss out on the case where the remote table did get analyzed once, when empty, but now isn't empty. I realize that shouldn't happen very often, but the cost of rowsampling a table that is empty is very low.

I think that would be the user's fault, as it's the user's
responsibility to ensure that the existing stats for the remote table
are up-to-date. That said, not all users will be able to keep them
up-to-date, so I'm thinking of disabling this feature by default.
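
To make the behavior being discussed concrete, here is a rough
standalone sketch of the decision, not the code in the patch. The enum
and the function name choose_stats_action are hypothetical and exist
only for illustration. It relies on the fact that, since v14,
pg_class.reltuples is -1 for a table that has never been vacuumed or
analyzed, so 0 means "analyzed while empty". The handling of pre-v14
servers is an assumption on my part.

```c
#include <assert.h>

/* Hypothetical outcomes; these names do not appear in the patch. */
typedef enum
{
    STATS_IMPORT_FULL,           /* import relation and attribute stats */
    STATS_IMPORT_RELSTATS_ONLY,  /* remote table was analyzed while empty */
    STATS_FALL_BACK_TO_SAMPLING  /* no usable remote stats; sample rows */
} StatsAction;

/*
 * Decide what to do with the reltuples value fetched from the remote
 * server.  remote_version_num uses the PG_VERSION_NUM format
 * (e.g. 140000 for v14).
 */
StatsAction
choose_stats_action(int remote_version_num, double reltuples)
{
    assert(remote_version_num > 0);

    if (remote_version_num >= 140000)
    {
        /* -1 means the remote table has never been vacuumed/analyzed. */
        if (reltuples < 0)
            return STATS_FALL_BACK_TO_SAMPLING;

        /* 0 means "analyzed and found empty": trust it, skip sampling. */
        if (reltuples == 0)
            return STATS_IMPORT_RELSTATS_ONLY;

        return STATS_IMPORT_FULL;
    }

    /*
     * Assumption: before v14, 0 cannot be told apart from "never
     * analyzed", so treat it as having no usable stats.
     */
    if (reltuples <= 0)
        return STATS_FALL_BACK_TO_SAMPLING;

    return STATS_IMPORT_FULL;
}
```

The case Corey describes is the STATS_IMPORT_RELSTATS_ONLY branch: if
the remote table was analyzed while empty and has grown since, this
logic keeps trusting the stale zero until the remote side is analyzed
again.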

> I see that remote_analyze didn't make it as a part of this patch. Is that something you'd repackaged as a follow-on patch, or are you just done with it?

Just reviewing/polishing the 0001/0002 patches was a lot of work, so I
didn't have time to look at the remote_analyze patch. We are running
out of time, so I'm afraid that I won't be able to get to it.

I modified the patch further:

* Modified postgresImportStatistics to create RemoteAttributeMapping if needed.

* The query executed in fetch_relstats is almost the same as the one
executed in postgresGetAnalyzeInfoForForeignTable. To avoid code
duplication, I modified fetch_relstats to use the latter query. I also
changed it to use PQsendQuery, not PQsendQueryParams, for efficiency.

* Modified import_spi_query_ok to get the result of an import query by
using SPI_getbinval, not SPI_getvalue, for efficiency.

Attached is a new version of the patch.

Thanks for reviewing!

Best regards,
Etsuro Fujita

Attachment: v18-Add-remote-statistics-fetching-to-postgres_fdw.patch (application/octet-stream, 51.1 KB)
