Re: Problems with plan estimates in postgres_fdw

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problems with plan estimates in postgres_fdw
Date: 2018-12-17 13:09:34
Message-ID: 5C17A00E.5000005@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

(2018/10/09 14:48), Etsuro Fujita wrote:
> (2018/10/05 19:15), Etsuro Fujita wrote:
>> (2018/08/02 23:41), Tom Lane wrote:
>>> Andrew Gierth<andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>>>> [ postgres_fdw is not smart about exploiting fast-start plans ]
>>>
>>> Yeah, that's basically not accounted for at all in the current design.
>>>
>>>> One possibility: would it be worth adding an option to EXPLAIN that
>>>> makes it assume cursor_tuple_fraction?
>>>
>>> [ handwaving ahead ]
>>>
>>> I wonder whether it could be done without destroying postgres_fdw's
>>> support for old servers, by instead including a LIMIT in the query sent
>>> for explaining. The trick would be to know what value to put as the
>>> limit, though. It'd be easy to do if we were willing to explain the
>>> query
>>> twice (the second time with a limit chosen as a fraction of the rowcount
>>> seen the first time), but man that's an expensive solution.
>>>
>>> Another component of any real fix here would be to issue "SET
>>> cursor_tuple_fraction" before opening the execution cursor, so as to
>>> ensure that we actually get an appropriate plan on the remote side.
>>>
>>> If we could tell whether there's going to be any use in fast-start
>>> plans,
>>> it might make sense to build two scan paths for a foreign table, one
>>> based
>>> on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
>>> means two explain requests, which is why I'm not thrilled about doing it
>>> unless there's a high probability of the extra explain being useful.
>>
>> Agreed, but ISTM that to address the original issue, it would be enough
>> to jsut add LIMIT (or ORDER BY LIMIT) pushdown to postgres_fdw based on
>> the upper-planner-pathification work.
>
> Will work on it unless somebody else wants to.

Here is a set of WIP patches for pushing down ORDER BY LIMIT to the remote:

* 0001-postgres-fdw-upperrel-ordered-WIP.patch:
This patch performs the UPPERREL_ORDERED step remotely.

* 0002-postgres-fdw-upperrel-final-WIP.patch:
This patch performs the UPPERREL_FINAL step remotely. Currently, this
only supports for SELECT commands, and pushes down LIMIT/OFFSET to the
remote if possible. This also removes LockRows if there is a FOR
UPDATE/SHARE clause, which would be safe because postgres_fdw performs
early locking. I'd like to leave INSERT, UPDATE and DELETE cases for
future work. It is my long-term todo to rewrite PlanDirectModify using
the upper-planner-pathification work. :)

For some regression test cases with ORDER BY and/or LIMIT, I noticed
that these patches still cannot push down those clause to the remote. I
guess it would be needed to tweak the cost/size estimation added by
these patches, but I didn't look at that in detail yet. Maybe I'm
missing something, though.

Comments welcome!

Best regards,
Etsuro Fujita

Attachment Content-Type Size
0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-WIP.patch text/x-patch 40.9 KB
0002-postgres_fdw-Perform-UPPERREL_FINAL-step-remotely-WIP.patch text/x-patch 95.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Kukushkin 2018-12-17 13:10:19 Re: Connection slots reserved for replication
Previous Message Alexander Korotkov 2018-12-17 12:35:52 Re: gist microvacuum doesn't appear to care about hot standby?