|From:||Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>|
|Subject:||Problems with plan estimates in postgres_fdw|
|Views:||Raw Message | Whole Thread | Download mbox|
This analysis comes from investigating a report from an IRC user. A
summary of the initial report is:
Using PG 9.6.9 and postgres_fdw, a query of the form "select * from
foreign_table order by col limit 1" is getting a local Sort plan, not
pushing the ORDER BY to the remote. Turning off use_remote_estimates
changes the plan to use a remote sort, with a 10000x speedup.
I don't think this can be called a bug, exactly, and I don't have an
immediate fix, so I'm putting this analysis up for the benefit of anyone
working on this in future.
The cause of the misplan seems to be this: postgres_fdw with
use_remote_estimates on does not attempt to obtain fast-start plans from
the remote. In this case what happens is this:
1. postgres_fdw gets the cost estimate from the plain remote fetch, by
doing "EXPLAIN select * from table". This produces a plan with a low
startup cost (just the constant overhead) and a high total cost (on
the order of 1.2e6 in this case).
2. postgres_fdw gets the cost estimate for the ordered fetch, by doing
"EXPLAIN select * from table order by col". Note that there is no
LIMIT nor any cursor_tuple_fraction in effect, so the plan returned
in this case is a seqscan+sort plan (in spite of the presence of an
index on "col"), with a very high (order of 8e6) startup and total
So when the local side tries to generate paths, it has the choice of
using a remote-ordered path with startup cost 8e6, or a local top-1
sort on top of an unordered remote path, which has a total cost on the
order of 1.5e6 in this case; cheaper than the remote sort because this
only needs to do top-1, while the remote is sorting millions of rows
and would probably spill to disk.
However, when it comes to actual execution, postgres_fdw opens a cursor
for the remote query, which means that cursor_tuple_fraction will come
into play. As far as I can tell, this is not set anywhere, so this means
that the plan that actually gets run on the remote is likely to have
_completely_ different costs from those returned by the EXPLAINs. In
particular, in this case the fast-start index-scan plan for the ORDER BY
remote query is clearly being chosen when use_remote_estimates is off
(since the query completes in 15ms rather than 150 seconds).
One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?
|Next Message||David Rowley||2018-08-02 01:10:57||Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian|
|Previous Message||Tom Lane||2018-08-01 23:48:24||Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian|