Re: Foreign join pushdown vs EvalPlanQual

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp
Cc: robertmhaas(at)gmail(dot)com, kaigai(at)ak(dot)jp(dot)nec(dot)com, pgsql-hackers(at)postgresql(dot)org, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Foreign join pushdown vs EvalPlanQual
Date: 2015-10-14 08:55:15
Message-ID: 20151014.175515.26559651.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ah..

I understood that what you mentioned is the lack of local recheck
of foreigh tuples. Sorry for the noise.

At Wed, 14 Oct 2015 17:31:16 +0900, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <561E12D4(dot)7040403(at)lab(dot)ntt(dot)co(dot)jp>
On 2015/10/10 10:17, Robert Haas wrote:
> > On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita
> > <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >>> The best plan is presumably something like this as you said before:
> >>>
> >>> LockRows
> >>> -> Nested Loop
> >>> -> Seq Scan on verysmall v
> >>> -> Foreign Scan on bigft1 and bigft2
> >>> Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
> >>> bigft2.x AND bigft1.q = $1 AND bigft2.r = $2
> >>>
> >>> Consider the EvalPlanQual testing to see if the updated version of a
> >>> tuple in v satisfies the query. If we use the column in the testing,
> >>> we
> >>> would get the wrong results in some cases.
>
> >> More precisely, we would get the wrong result when the value of v.q or
> >> v.r
> >> in the updated version has changed.
>
> > Interesting test case. It's worth considering why this works if you
> > were to replace the Foreign Scan with an Index Scan; suppose the query
> > is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x =
> > t.x FOR UPDATE OF v, so that you get:
> >
> > LockRows
> > -> Nested Loop
> > -> Seq Scan on verysmall v
> > -> Foreign Scan on realbiglocaltable t
> > Index Cond: v.x = t.x
> >
> > In your example, the remote SQL pushes down certain quals to the
> > remote server, and so if we just return the same tuple they might no
> > longer be satisfied. In this example, the index qual is essentially a
> > filter condition that has been "pushed down" into the index AM. The
> > EvalPlanQual machinery prevents this from generating wrong answers by
> > rechecking the index cond - see IndexRecheck. Even though it's
> > normally the AM's job to enforce the index cond, and the executor does
> > not need to recheck, in the EvalPlanQual case it does need to recheck.
> >
> > I think the foreign data wrapper case should be handled the same way.
> > Any condition that we initially pushed down to the foreign server
> > needs to be locally rechecked if we're inside EPQ.
>
> Agreed.
>
> As KaiGai-san also pointed out before, I think we should address this
> in each of the following cases:
>
> 1) remote qual (scanrelid>0)
> 2) remote join (scanrelid==0)
>
> As for #1, I noticed that there is a bug in handling the same kind of
> FDW queries, which will be shown below. As you said, I think this
> should be addressed by rechecking the remote quals *locally*. (I
> thought another fix for this kind of bug before, though.) IIUC, I
> think this should be fixed separately from #2, as this is a bug not
> only in 9.5, but in back branches. Please find attached a patch.
>
> Create an environment:
>
> mydatabase=# create table t1 (a int primary key, b text);
> mydatabase=# insert into t1 select a, 'notsolongtext' from
> generate_series(1, 1000000) a;
>
> postgres=# create server myserver foreign data wrapper postgres_fdw
> options (dbname 'mydatabase');
> postgres=# create user mapping for current_user server myserver;
> postgres=# create foreign table ft1 (a int, b text) server myserver
> options (table_name 't1');
> postgres=# alter foreign table ft1 options (add use_remote_estimate
> 'true');
> postgres=# create table inttab (a int);
> postgres=# insert into inttab select a from generate_series(1, 10) a;
> postgres=# analyze ft1;
> postgres=# analyze inttab;
>
> Run concurrent transactions that produce incorrect result:
>
> [Terminal1]
> postgres=# begin;
> BEGIN
> postgres=# update inttab set a = a + 1 where a = 1;
> UPDATE 1
>
> [Terminal2]
> postgres=# explain verbose select * from inttab, ft1 where inttab.a =
> ft1.a limit 1 for update;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------
> Limit (cost=100.43..198.99 rows=1 width=70)
> Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
> -> LockRows (cost=100.43..1086.00 rows=10 width=70)
> Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
> -> Nested Loop (cost=100.43..1085.90 rows=10 width=70)
> Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
> -> Seq Scan on public.inttab (cost=0.00..1.10 rows=10
> -> width=10)
> Output: inttab.a, inttab.ctid
> -> Foreign Scan on public.ft1 (cost=100.43..108.47 rows=1
> -> width=18)
> Output: ft1.a, ft1.b, ft1.*
> Remote SQL: SELECT a, b FROM public.t1 WHERE
> (($1::integer = a)) FOR UPDATE
> (11 rows)
>
> postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1
> for update;
>
> [Terminal1]
> postgres=# commit;
> COMMIT
>
> [Terminal2]
> (After the commit in Terminal1, the following result will be shown in
> Terminal2. Note that the values of inttab.a and ft1.a wouldn't
> satisfy the remote qual!)
> a | a | b
> ---+---+---------------
> 2 | 1 | notsolongtext
> (1 row)
>
> As for #2, I didn't come up with any solution to locally rechecking
> pushed-down join conditions against a joined tuple populated from a
> column that we discussed. Instead, I'd like to revise a
> local-join-execution-plan-based approach that we discussed before, by
> addressing your comments such as [1]. Would it be the right way to
> go?
>
> Best regards,
> Etsuro Fujita
>
> [1]
> http://www.postgresql.org/message-id/CA+TgmoaAzs0dR23R7PTBseQfwOtuVCPNBqDHxeBo9Gi+dMxj8w@mail.gmail.com

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Beena Emerson 2015-10-14 08:58:13 Re: Support for N synchronous standby servers - take 2
Previous Message Etsuro Fujita 2015-10-14 08:53:29 Re: Foreign join pushdown vs EvalPlanQual