Quick Links

Re: Foreign join pushdown vs EvalPlanQual

From:	Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Kohei KaiGai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: Foreign join pushdown vs EvalPlanQual
Date:	2015-10-14 08:31:16
Message-ID:	561E12D4.7040403@lab.ntt.co.jp
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2015/10/10 10:17, Robert Haas wrote:
> On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita
> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>> The best plan is presumably something like this as you said before:
>>>
>>> LockRows
>>> -> Nested Loop
>>> -> Seq Scan on verysmall v
>>> -> Foreign Scan on bigft1 and bigft2
>>> Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
>>> bigft2.x AND bigft1.q = $1 AND bigft2.r = $2
>>>
>>> Consider the EvalPlanQual testing to see if the updated version of a
>>> tuple in v satisfies the query. If we use the column in the testing, we
>>> would get the wrong results in some cases.

>> More precisely, we would get the wrong result when the value of v.q or v.r
>> in the updated version has changed.

> Interesting test case. It's worth considering why this works if you
> were to replace the Foreign Scan with an Index Scan; suppose the query
> is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x =
> t.x FOR UPDATE OF v, so that you get:
>
> LockRows
> -> Nested Loop
> -> Seq Scan on verysmall v
> -> Foreign Scan on realbiglocaltable t
> Index Cond: v.x = t.x
>
> In your example, the remote SQL pushes down certain quals to the
> remote server, and so if we just return the same tuple they might no
> longer be satisfied. In this example, the index qual is essentially a
> filter condition that has been "pushed down" into the index AM. The
> EvalPlanQual machinery prevents this from generating wrong answers by
> rechecking the index cond - see IndexRecheck. Even though it's
> normally the AM's job to enforce the index cond, and the executor does
> not need to recheck, in the EvalPlanQual case it does need to recheck.
>
> I think the foreign data wrapper case should be handled the same way.
> Any condition that we initially pushed down to the foreign server
> needs to be locally rechecked if we're inside EPQ.

Agreed.

As KaiGai-san also pointed out before, I think we should address this in
each of the following cases:

1) remote qual (scanrelid>0)
2) remote join (scanrelid==0)

As for #1, I noticed that there is a bug in handling the same kind of
FDW queries, which will be shown below. As you said, I think this
should be addressed by rechecking the remote quals *locally*. (I
thought another fix for this kind of bug before, though.) IIUC, I think
this should be fixed separately from #2, as this is a bug not only in
9.5, but in back branches. Please find attached a patch.

Create an environment:

mydatabase=# create table t1 (a int primary key, b text);
mydatabase=# insert into t1 select a, 'notsolongtext' from
generate_series(1, 1000000) a;

postgres=# create server myserver foreign data wrapper postgres_fdw
options (dbname 'mydatabase');
postgres=# create user mapping for current_user server myserver;
postgres=# create foreign table ft1 (a int, b text) server myserver
options (table_name 't1');
postgres=# alter foreign table ft1 options (add use_remote_estimate 'true');
postgres=# create table inttab (a int);
postgres=# insert into inttab select a from generate_series(1, 10) a;
postgres=# analyze ft1;
postgres=# analyze inttab;

Run concurrent transactions that produce incorrect result:

[Terminal1]
postgres=# begin;
BEGIN
postgres=# update inttab set a = a + 1 where a = 1;
UPDATE 1

[Terminal2]
postgres=# explain verbose select * from inttab, ft1 where inttab.a =
ft1.a limit 1 for update;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Limit (cost=100.43..198.99 rows=1 width=70)
Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
-> LockRows (cost=100.43..1086.00 rows=10 width=70)
Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
-> Nested Loop (cost=100.43..1085.90 rows=10 width=70)
Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
-> Seq Scan on public.inttab (cost=0.00..1.10 rows=10
width=10)
Output: inttab.a, inttab.ctid
-> Foreign Scan on public.ft1 (cost=100.43..108.47
rows=1 width=18)
Output: ft1.a, ft1.b, ft1.*
Remote SQL: SELECT a, b FROM public.t1 WHERE
(($1::integer = a)) FOR UPDATE
(11 rows)

postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1 for
update;

[Terminal1]
postgres=# commit;
COMMIT

[Terminal2]
(After the commit in Terminal1, the following result will be shown in
Terminal2. Note that the values of inttab.a and ft1.a wouldn't satisfy
the remote qual!)
a | a | b
---+---+---------------
2 | 1 | notsolongtext
(1 row)

As for #2, I didn't come up with any solution to locally rechecking
pushed-down join conditions against a joined tuple populated from a
column that we discussed. Instead, I'd like to revise a
local-join-execution-plan-based approach that we discussed before, by
addressing your comments such as [1]. Would it be the right way to go?

Best regards,
Etsuro Fujita

[1]
http://www.postgresql.org/message-id/CA+TgmoaAzs0dR23R7PTBseQfwOtuVCPNBqDHxeBo9Gi+dMxj8w@mail.gmail.com

Attachment	Content-Type	Size
foreign-recheck-for-foreign-table-1.patch	text/x-patch	9.0 KB

In response to

Re: Foreign join pushdown vs EvalPlanQual at 2015-10-10 01:17:42 from Robert Haas

Responses

Re: Foreign join pushdown vs EvalPlanQual at 2015-10-14 08:55:15 from Kyotaro HORIGUCHI
Re: Foreign join pushdown vs EvalPlanQual at 2015-10-14 19:21:41 from Robert Haas
Re: Foreign join pushdown vs EvalPlanQual at 2015-10-16 09:00:45 from Etsuro Fujita

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kyotaro HORIGUCHI	2015-10-14 08:33:01	Re: Dangling Client Backend Process
Previous Message	Shulgin, Oleksandr	2015-10-14 08:26:58	Re: Database schema diff