Re: [HACKERS] postgres_fdw bug in 9.6

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] postgres_fdw bug in 9.6
Date: 2018-01-12 21:07:09
Message-ID: CA+TgmoZ5pEQkYTyFRRS9Sfw1Dhy=R6U2BNk3WQz_NPwp5jr-HA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Sun, Dec 3, 2017 at 2:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I thought some more about this. While it seems clear that we don't
> actually need to recheck anything within a postgres_fdw foreign join,
> there's still a problem: we have to be able to regurgitate the join
> row during postgresRecheckForeignScan. Since, in the current system
> design, the only available data is scan-level rowmarks (that is,
> whole-row values from each base foreign relation), there isn't any
> very good way to produce the join row except to insert the rowmark
> values into dummy scan nodes and then re-execute the join locally.
> So really, that is what the outerPlan infrastructure is doing for us.

I started to look at this patch again today and got cold feet. It
seems to me that this entire design on the posted patch is based on
your remarks in http://postgr.es/m/13242.1481582736@sss.pgh.pa.us --

# One way that we could make things better is to rely on the knowledge
# that EPQ isn't asked to evaluate joins for more than one row per input
# relation, and therefore using merge or hash join technology is really
# overkill. We could make a tree of simple nestloop joins, which aren't
# going to care about sort order, if we could identify the correct join
# clauses to apply.

However, those remarks are almost entirely wrong. It's true that the
relations for which we have EPQ tuples will only ever output a single
tuple, but because of what you said in the quoted text above, this
does NOT imply that the recheck plan never needs to worry about
dealing with a large number of rows, as the following example shows.
(The part about making a tree of simple nestloop joins is wrong not
only for this reason but because it won't work in the executor, as
previously discussed.)

rhaas=# explain select * from foo join (bar join baz on bar.a = baz.a)
on foo.a = bar.a where foo.b = 'hello' for update;
QUERY PLAN
-----------------------------------------------------------------------------------------------
LockRows (cost=109.02..121.46 rows=1 width=162)
-> Merge Join (cost=109.02..121.45 rows=1 width=162)
Merge Cond: (bar.a = foo.a)
-> Foreign Scan (cost=100.58..4914.58 rows=40000 width=119)
Relations: (public.bar) INNER JOIN (public.baz)
-> Hash Join (cost=2556.00..4962.00 rows=40000 width=119)
Hash Cond: (bar.a = baz.a)
-> Foreign Scan on bar (cost=100.00..1956.00
rows=40000 width=28)
-> Hash (cost=1956.00..1956.00 rows=40000 width=27)
-> Foreign Scan on baz
(cost=100.00..1956.00 rows=40000 width=27)
-> Sort (cost=8.44..8.45 rows=1 width=28)
Sort Key: foo.a
-> Index Scan using foo_b_idx on foo (cost=0.41..8.43
rows=1 width=28)
Index Cond: (b = 'hello'::text)

There's really nothing to prevent the bar/baz join from producing
arbitrarily many rows, which means that it's not good to turn the
recheck plan into a Nested Loop unconditionally -- and come to think
of it, I'm pretty sure this kind of case is why
GetExistingLocalJoinPath() tries to using an existing path: it wants a
path that isn't going suck if it gets executed.

After some thought, it seems that there's a much simpler way that we
could fix the problem you identified in that original email: if the
EPQ path isn't properly sorted, have postgres_fdw's
add_paths_with_pathkeys_for_rel stick a Sort node on top of it. I
tried this and it does indeed fix Jeff Janes' initial test case.
Patch attached. One could do better here if one wanted to revise
GetExistingLocalJoinPath() to take Last *pathkeys as an additional
argument; we could then try to find the optimal EPQ path for each
possible set of sortkeys. But I don't feel the need to work on that
right now, and it would be nicer not to back-patch a change to the
signature of GetExistingLocalJoinPath(), as has already been noted.

It is altogether possible that this doesn't fix every issue with the
GetExistingLocalJoinPath interface, and perhaps that is due for a
broader rewrite. However, I think that more recent discussion has
cast some doubt on the fiery criticism of that previous note. That
note asserted that we'd have trouble with join nests containing more
than 2 joins, but as was pointed out at the time, that's not really
true, because if we happen to pull a join out of the pathlist that has
ForeignScan children, we'll fetch out the EPQ-recheck subpaths that
they created and use those instead. And there are other problems with
the analysis in that email too as noted above. The only problem that
*has actually been demonstrated* is that we can end up using as an EPQ
path something that doesn't generate the right sort order, and that is
easily fixed by making it do so, which the attached patch does. So I
propose we commit and back-patch this and move on.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
sort-epq-path-if-needed.patch application/octet-stream 14.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Rofail 2018-01-12 21:19:29 Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Previous Message John Naylor 2018-01-12 20:54:57 Re: WIP: a way forward on bootstrap data