Re: postgres_fdw: using TABLESAMPLE to collect remote sample

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: James Finnerty <jfinnert(at)amazon(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: postgres_fdw: using TABLESAMPLE to collect remote sample
Date: 2023-01-07 15:20:10
Message-ID: 3447192.1673104810@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> However, maybe views are not the best / most common example to think
> about. I'd imagine it's much more common to reference a regular table,
> but the table gets truncated / populated quickly, and/or the autovacuum
> workers are busy so it takes time to update reltuples. But in this case
> it's also quite simple to fix the correlation by just ordering by ctid
> (which I guess we might do based on the relkind).

> There's a minor issue with partitioned tables, with foreign partitions
> pointing to remote views. This is kinda broken, because the sample size
> for individual relations is determined based on relpages. But that's 0
> for views, so these partitions get ignored when building the sample. But
> that's a pre-existing issue.

I wonder if we should stop consulting reltuples directly at all,
and instead do

"EXPLAIN SELECT * FROM remote_table"

to get the remote planner's estimate of the number of rows involved.
Even for a plain table, that's likely to be a better number.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zheng Li 2023-01-07 15:28:11 Re: Support logical replication of DDLs
Previous Message Tomas Vondra 2023-01-07 14:07:58 Re: postgres_fdw: using TABLESAMPLE to collect remote sample