Re: Ideas about a better API for postgres_fdw remote estimates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Ideas about a better API for postgres_fdw remote estimates
Date: 2020-07-06 15:05:46
Message-ID: 1374896.1594047946@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> Per postgres_fdw's get_remote_estimate(), the only data we use right now
>> is the startup_cost, total_cost, rows and width estimates from the
>> top-level Plan node. That's available immediately from the Plan tree,
>> meaning that basically *nothing* of the substantial display effort
>> expended by explain.c and ruleutils.c is of any value. So the level-zero

> The 'display effort' you're referring to, when using JSON format with
> explain, is basically to format the results into JSON and return them-
> which is what you're suggesting this mode would do anyway, no..?

Not hardly. Spend some time studying ruleutils.c sometime ---
reverse-compiling a plan is *expensive*. For instance, we have to
look up the names of all the operators used in the query quals,
decide what needs quoting, decide what needs parenthesization, etc.
There's also a fun little bit that assigns unique aliases to each
table appearing in the query, which from memory is at least O(N^2)
and maybe worse. (Admittedly, shipped queries are usually not so
complicated that N would be large.) And by the way, we're also
starting up the executor, even if you didn't say ANALYZE.

A little bit of fooling with "perf" suggests that when explaining
a pretty simple bitmapscan query --- I used
EXPLAIN SELECT * FROM tenk1 WHERE unique1 > 9995
which ought to be somewhat representative of what postgres_fdw needs
--- only about half of the runtime is spent within pg_plan_query, and
the other half is spent on explain.c + ruleutils.c formatting work.
So while getting rid of that overhead wouldn't be an earthshattering
improvement, I think it'd be worthwhile.

>> Further down the road, we might want to rethink the whole idea of
>> completely constructing a concrete Plan. We could get the data we need
>> at the list-of-Paths stage. Even more interesting, we could (with very
>> little more work) return data about multiple Paths, so that the client
>> could find out, for example, the costs of sorted and unsorted output
>> without paying two network round trips to discover that.

> I have to admit that I'm not really sure how we could make it work, but
> having a way to get multiple paths returned by EXPLAIN would certainly
> be interesting to a lot of users. Certainly it's easier to see how we
> could get at that info in a postgres_fdw-specific function, and be able
> to understand how to deal with it there and what could be done, but once
> it's there I wonder if other tools might see that and possibly even
> build on it because it'd be the only way to get that kind of info, which
> certainly wouldn't be ideal.

Yeah, thinking about it as a function that inspects partial planner
results, it might be useful for other purposes besides postgres_fdw.
As I said before, I don't think this necessarily has to be bundled as
part of postgres_fdw. That still doesn't make it part of EXPLAIN.

> That postgres_fdw is an extension is almost as much of a wart as
> anything being discussed here and suggesting that things added to
> postgres_fdw aren't 'core features' seems akin to ignoring the forest
> for the trees-

I think we just had this discussion in another thread. The fact that
postgres_fdw is an extension is a feature, not a bug, because (a) it
means that somebody could implement their own version if they wanted
it to act differently; and (b) it keeps us honest about whether the
APIs needed by an FDW are accessible from outside core. I think moving
postgres_fdw into core would be a large step backwards.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-07-06 15:23:22 bad status of FETCH PERCENT in commitfest application
Previous Message Justin Pryzby 2020-07-06 14:59:47 Re: Proposal: Automatic partition creation