Re: Postgres_fdw join pushdown - wrong results with whole-row reference

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Date: 2016-06-22 08:56:05
Message-ID: a1fa1c4c-bf96-8ea5-cff5-85b927298e73@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016/06/22 17:11, Amit Langote wrote:
> I wonder whether such a whole-row-var would arise from the nullable side
> of a join? I guess not. Not that I'm saying we shouldn't account for that
> case at all since any and every whole-row-var in the targetlist currently
> gets that treatment, even those that are known non-nullable. Couldn't we
> have prevented the latter somehow? IOW, only generate the CASE WHEN when
> a Var being deparsed is known nullable as the comment there says:
>
> deparse.c:
>
> 1639 /*
> 1640 * In case the whole-row reference is under an outer join then it has
> 1641 * to go NULL whenver the rest of the row goes NULL. Deparsing a join
> 1642 * query would always involve multiple relations, thus qualify_col
> 1643 * would be true.
> 1644 */
> 1645 if (qualify_col)
> 1646 {
> 1647 appendStringInfoString(buf, "CASE WHEN");
> 1648 ADD_REL_QUALIFIER(buf, varno);
> 1649 appendStringInfo(buf, "* IS NOT NULL THEN ");
> 1650 }

I think we could address this in another way once we support deparsing
subqueries; rewrite the remote query into something that wouldn't need
the CASE WHEN conversion. For example, we currently have:

postgres=# explain verbose select ft2 from ft1 left join ft2 on ft1.a =
ft2.a;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..110.04 rows=1 width=32)
Output: ft2.*
Relations: (public.ft1) LEFT JOIN (public.ft2)
Remote SQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b)
END FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a))))
(4 rows)

However, if we support deparsing subqueries, the remote query in the
above example could be rewritten into something like this:

SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) ss(c1,
c2) ON (t1.a = ss.c1);

So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a,
r2.b) END" in the target list in the remote query.

For the CASE WHEN conversion for a system column other than ctid, we
could also address this by replacing the whole-row reference in the IS
NOT NULL condition in that conversion with the system column reference.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-06-22 09:10:45 Re: Hash Indexes
Previous Message Amit Langote 2016-06-22 08:11:41 Re: Postgres_fdw join pushdown - wrong results with whole-row reference