Problems with plan estimates in postgres_fdw

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Problems with plan estimates in postgres_fdw
Date: 2018-08-02 00:06:41
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

This analysis comes from investigating a report from an IRC user. A
summary of the initial report is:

Using PG 9.6.9 and postgres_fdw, a query of the form "select * from
foreign_table order by col limit 1" is getting a local Sort plan, not
pushing the ORDER BY to the remote. Turning off use_remote_estimates
changes the plan to use a remote sort, with a 10000x speedup.

I don't think this can be called a bug, exactly, and I don't have an
immediate fix, so I'm putting this analysis up for the benefit of anyone
working on this in future.

The cause of the misplan seems to be this: postgres_fdw with
use_remote_estimates on does not attempt to obtain fast-start plans from
the remote. In this case what happens is this:

1. postgres_fdw gets the cost estimate from the plain remote fetch, by
doing "EXPLAIN select * from table". This produces a plan with a low
startup cost (just the constant overhead) and a high total cost (on
the order of 1.2e6 in this case).

2. postgres_fdw gets the cost estimate for the ordered fetch, by doing
"EXPLAIN select * from table order by col". Note that there is no
LIMIT nor any cursor_tuple_fraction in effect, so the plan returned
in this case is a seqscan+sort plan (in spite of the presence of an
index on "col"), with a very high (order of 8e6) startup and total

So when the local side tries to generate paths, it has the choice of
using a remote-ordered path with startup cost 8e6, or a local top-1
sort on top of an unordered remote path, which has a total cost on the
order of 1.5e6 in this case; cheaper than the remote sort because this
only needs to do top-1, while the remote is sorting millions of rows
and would probably spill to disk.

However, when it comes to actual execution, postgres_fdw opens a cursor
for the remote query, which means that cursor_tuple_fraction will come
into play. As far as I can tell, this is not set anywhere, so this means
that the plan that actually gets run on the remote is likely to have
_completely_ different costs from those returned by the EXPLAINs. In
particular, in this case the fast-start index-scan plan for the ORDER BY
remote query is clearly being chosen when use_remote_estimates is off
(since the query completes in 15ms rather than 150 seconds).

One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?

Andrew (irc:RhodiumToad)


Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-08-02 01:10:57 Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian
Previous Message Tom Lane 2018-08-01 23:48:24 Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian