Re: estimation problems for DISTINCT ON with FDW

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: estimation problems for DISTINCT ON with FDW
Date: 2020-07-02 02:46:37
Message-ID: CAPmGK15afZcgRKPHzn4oZ3aat3qjWWeMWj0nvaEwa8DXeYY7yg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 1, 2020 at 11:40 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> writes:
> > On Wed, Jul 1, 2020 at 7:21 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> + baserel->tuples = Max(baserel->tuples, baserel->rows);
>
> > for consistency, this should be
> > baserel->tuples = clamp_row_est(baserel->rows / sel);
> > where sel is the selectivity of the baserestrictinfo clauses?
>
> If we had the selectivity available, maybe so, but we don't.
> (And even less so if we put this logic in the core code.)
>
> Short of sending a whole second query to the remote server, it's
> not clear to me how we could get the full table size (or equivalently
> the target query's selectivity for that table). The best we realistically
> can do is to adopt pg_class.reltuples if there's been an ANALYZE of
> the foreign table. That case already works (and this proposal doesn't
> break it). The problem is what to do when pg_class.reltuples is zero
> or otherwise badly out-of-date.

In estimate_path_cost_size(), when use_remote_estimate is true, we
adjust the row estimate returned by the remote server by factoring
in the selectivity of the locally checked quals. I thought what I
proposed above would be more consistent with that.
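
To make the difference between the two rules concrete, here is a minimal
standalone sketch. It is not postgres_fdw code: sketch_clamp_row_est() is a
simplified stand-in for the planner's clamp_row_est(), the two helper
functions are hypothetical names, and sel is assumed to be a known
selectivity in (0, 1], which is exactly the value that is not available
in the case discussed above.

```c
/*
 * Simplified stand-in for the planner's clamp_row_est(): round to a
 * whole number of rows and never return less than one row.
 * (Assumption: rounding is done by a cast here to stay dependency-free;
 * this is not the core implementation.)
 */
static double
sketch_clamp_row_est(double nrows)
{
    if (nrows <= 1.0)
        return 1.0;
    return (double) (long long) (nrows + 0.5);
}

/* Rule from the patch line: tuples = Max(tuples, rows). */
static double
tuples_by_max(double tuples, double rows)
{
    return (tuples > rows) ? tuples : rows;
}

/*
 * Rule proposed above: tuples = clamp_row_est(rows / sel).
 * Assumes 0 < sel <= 1; the caller must guarantee that.
 */
static double
tuples_by_selectivity(double rows, double sel)
{
    return sketch_clamp_row_est(rows / sel);
}
```

With rows = 1000 and tuples = 0 (no ANALYZE yet), the first rule gives
tuples = 1000, while the second gives 4000 for an assumed sel of 0.25.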

Best regards,
Etsuro Fujita
