Re: [HACKERS] postgres_fdw bug in 9.6

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] postgres_fdw bug in 9.6
Date: 2018-01-15 21:38:04
Message-ID: 9864.1516052284@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jan 15, 2018 at 12:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hm. Simple is certainly good, but if there's multiple rows coming
>> back during an EPQ recheck then I think we have a performance problem.

> You are correct ... I was wrong about that part, and said so in an
> email on this thread sent about 45 minutes before yours. However, I
> still think the patch is a good fix for the immediate issue, unless
> you see some problem with it. It's simple and back-patchable and does
> not preclude further work anybody, including you, might want to do in
> the future.

I spent some more time poking at this. I confirmed that there is *not*
a performance problem: as I'd thought, during an EPQ recheck each base
scan relation will return its single EPQ tuple, without any communication
with the remote server, whether or not the rel was marked FOR UPDATE.

There *is* a problem with GetExistingLocalJoinPath not honoring its
API spec: it will sometimes return join nests that include remote
joins at levels below the top, as I'd speculated to begin with.
This turns out not to be a problem for postgres_fdw, because any
such remote-join node will just recursively fob off EPQ checks
onto its own outerpath, so that eventually you get down to base
scan relations anyway, and everything works. However, because
GetExistingLocalJoinPath is claimed to be a general purpose function
useful for other FDWs, the fact that its API contract is a lie
seems to me to be a problem anyway. Another FDW might have a stronger
dependency on there not being remote sub-joins. (Reviewing the thread
history, I see that Ashutosh agreed this was possible at
<CAFjFpRfYETZLiKERpxCeAN1MsiJypsyNd6x9FQ7OWjY=_TGK2Q(at)mail(dot)gmail(dot)com>
and suggested that we just change the function's commentary to not
make promises it won't keep. Maybe that's enough, but I think just
walking away isn't enough.)

I'm also still pretty unhappy with the amount of useless planning work
caused by doing GetExistingLocalJoinPath during path creation. It strikes
me that we could likely replace the entire thing with some code that just
reconstructs the join node's output tuple during EPQ using the rowmark
data for all the base relations. Outer joins aren't really a problem:
we could tell which relations were replaced by nulls because the rowmark
values that bubbled up to the top went to nulls themselves. However,
that's a nontrivial amount of work and probably wouldn't result in
something we cared to back-patch, especially since it's not really a bug
fix.

Anyway, I agree with you that we know of no observable bug here except
for Jeff's original complaint, and forcing sorting of the EPQ paths
would be enough to fix that.

Your patch seems OK codewise, but I think the comment is not nearly
adequate. Maybe something like "The EPQ path must be at least as well
sorted as the path itself, in case it gets used as input to a mergejoin."

Also, a regression test case would be appropriate perhaps. I tried
to reproduce Jeff's complaint in the postgres_fdw regression database,
and it actually failed on the very first multi-way join I tried:

contrib_regression=# explain select * from ft1,ft2,ft4,ft5 where ft1.c1=ft2.c1 and ft1.c1=ft4.c1 and ft1.c1=ft5.c1 for update;
ERROR: outer pathkeys do not match mergeclauses

Also, for the archives' sake, here's an example showing that
GetExistingLocalJoinPath is not doing what it says on the tin:

contrib_regression=# set from_collapse_limit TO 1;
SET
contrib_regression=# set enable_hashjoin TO 0;
SET
contrib_regression=# set enable_mergejoin TO 0;
SET
contrib_regression=# explain select ft1.c1,ft4.* from ft1,ft2 left join ft4 on ft2.c1=ft4.c1,ft5,loc1 where ft1.c1=ft2.c1 and ft1.c1=ft5.c1 and ft1.c1=loc1.f1 for update of ft1,loc1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
LockRows (cost=105.72..781.42 rows=210 width=286)
-> Nested Loop (cost=105.72..779.32 rows=210 width=286)
Join Filter: (ft2.c1 = loc1.f1)
-> Seq Scan on loc1 (cost=0.00..22.70 rows=1270 width=10)
-> Materialize (cost=105.72..128.06 rows=33 width=179)
-> Foreign Scan (cost=105.72..127.89 rows=33 width=179)
Relations: ((public.ft1) INNER JOIN ((public.ft2) LEFT JOIN (public.ft4))) INNER JOIN (public.ft5)
-> Nested Loop (cost=233.62..721.53 rows=136 width=179)
Join Filter: (ft2.c1 = ft5.c1)
-> Nested Loop (cost=202.12..12631.43 rows=822 width=137)
Join Filter: (ft1.c1 = ft2.c1)
-> Foreign Scan on ft1 (cost=100.00..141.00 rows=1000 width=75)
-> Materialize (cost=102.12..162.48 rows=822 width=83)
-> Foreign Scan (cost=102.12..158.37 rows=822 width=83)
Relations: (public.ft2) LEFT JOIN (public.ft4)
-> Nested Loop Left Join (cost=200.00..856.78 rows=822 width=83)
Join Filter: (ft2.c1 = ft4.c1)
-> Foreign Scan on ft2 (cost=100.00..137.66 rows=822 width=54)
-> Materialize (cost=100.00..102.75 rows=50 width=54)
-> Foreign Scan on ft4 (cost=100.00..102.50 rows=50 width=54)
-> Materialize (cost=100.00..102.16 rows=33 width=43)
-> Foreign Scan on ft5 (cost=100.00..101.99 rows=33 width=43)
(22 rows)

I don't think this latter example needs to become a regression test
in this form. If we wanted to gear up an isolation test that
actually exercised EPQ with postgres_fdw, maybe it'd be useful
in that context, to prove that nested remote joins work for EPQ.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-01-15 22:11:42 Re: [HACKERS] Partition-wise aggregation/grouping
Previous Message David Fetter 2018-01-15 21:32:55 Re: proposal: alternative psql commands quit and exit