Re: [HACKERS] postgres_fdw bug in 9.6

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] postgres_fdw bug in 9.6
Date: 2017-11-29 22:32:02
Message-ID: 27145.1511994722@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> writes:
> On Tue, Nov 28, 2017 at 11:05 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Nov 28, 2017 at 11:21 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> In short, we should get rid of all of this expensive and broken logic and
>>> just make EPQ recheck on a foreign join be a no-op, just as it is for a
>>> foreign base table.

> IIUC, the problem here is not about stale foreign tuples (or whether
> they can change) - as you described above they can not, since we have
> requested FOR UPDATE. But when a foreign join is further joined with a
> local table, tuples in which change after they are joined, we need to
> make sure that the total join tuple (local table joined with foreign
> join) still qualifies. We do this by locally joining those rows again.
> [1] is where the discussion started
> [1] https://www.postgresql.org/message-id/558A18B3.9050201@lab.ntt.co.jp

AFAICT, [1] just demonstrates that nobody had thought about the scenario
up to that point, not that the subsequently chosen solution was a good
one. In that example,

LockRows (cost=100.00..101.18 rows=4 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
-> Nested Loop (cost=100.00..101.14 rows=4 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
Join Filter: (foo.a = tab.a)
-> Seq Scan on public.tab (cost=0.00..1.01 rows=1 width=14)
Output: tab.a, tab.b, tab.ctid
-> Foreign Scan (cost=100.00..100.08 rows=4 width=64)
Output: foo.*, foo.a, bar.*, bar.a
Relations: (public.foo) INNER JOIN (public.bar)
Remote SQL: SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT ROW(l.a9), l.a9 FROM (SELECT a a9 FROM public.foo FOR UPDATE) l) l (a1, a2) INNER JOIN (SELECT ROW(r.a9), r.a9 FROM (SELECT a a9 FROM public.bar FOR UPDATE) r) r (a1, a2) ON ((l.a2 = r.a2))

the output of the foreign join cannot change during EPQ, since the remote
server already locked the rows before returning them. The only thing that
can change is the output of the local scan on public.tab. Yes, we then
need to re-verify that foo.a = tab.a ... but in this example, that's the
responsibility of the NestLoop plan node, not the foreign join.

If the topmost join were done differently, passing tab.a down as a
parameter into the foreign join, then we would have a situation where
there was something that needed to be rechecked within the foreign join.
But that would require having made a parameterized path for the foreign
join, which postgres_fdw doesn't do and lacks a lot of other
infrastructure for besides this, I believe.

Moreover, even if we had parameterized foreign join paths, I'm not sure
that it makes sense to build the amount of plan infrastructure involved
in the current implementation just to recheck the pushed-down qual(s).
That could be done a whole lot more cheaply, not only in terms of the
actual EPQ execution (which maybe you could argue isn't performance
critical), but also in terms of the planner effort and executor startup
effort involved to create all those subsidiary plan nodes.

I regret not having paid more attention to the foreign join stuff, because
maybe that could have saved you guys a lot of work ... but I really feel
like there was a lot of misguided work here.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-11-29 22:33:55 Re: [HACKERS] static assertions in C++
Previous Message Robert Haas 2017-11-29 22:15:24 Re: [HACKERS] path toward faster partition pruning