Re: Oddity in EXPLAIN for foreign/custom join pushdown plans

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Oddity in EXPLAIN for foreign/custom join pushdown plans
Date: 2016-07-28 01:01:41
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80120E5B9@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 2016/07/27 13:51, Kouhei Kaigai wrote:
> > This output is, at least, not incorrect.
> > This ForeignScan actually scan a relation that consists of two joined
> > tables on the remote side. So, not incorrect, but may not convenient for
> > better understanding by users who don't have deep internal knowledge.
>
> Hmm.
>
> > On the other hands, I cannot be happy with your patch because it concludes
> > the role of ForeignScan/CustomScan with scanrelid==0 is always join.
> > However, it does not cover all the case.
> >
> > When postgres_fdw gets support of remote partial aggregate, we can implement
> > the feature using ForeignScan with scanrelid==0. Is it a join? No.
>
> Yeah, I think that that could be implemented in both cases (scanrelid>0
> and scanrelid=0).
>
> > Probably, the core cannot know exact purpose of ForeignScan/CustomScan with
> > scanrelid==0. It can be a remote partial aggregation. It can be an alternative
> > sort logic by GPU. It can be something others.
> > So, I think extension needs to inform the core more proper label to show;
> > including a flag to control whether the relation(s) name shall be attached
> > next to the ForeignScan/CustomScan line.
> >
> > I'd like you to suggest to add two fields below:
> > - An alternative label extension wants to show instead of the default one
>
> How about adding something like the "Additional Operations" line as
> proposed in a previous email, instead? As you know, FDWs/Extensions
> could add such information through ExplainForeignScan/ExplainCustomScan.
>
No. What I'm saying is here:

EXPLAIN (COSTS false, VERBOSE)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY
t1.c3, t1.c1 OFFSET 100 LIMIT 10;
QUERY PLAN
-------------------------------------------
Limit
Output: t1.c1, t2.c1, t1.c3
-> Foreign Scan
^^^^
Output: t1.c1, t2.c1, t1.c3
Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T
1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY
r1.c3 ASC N\
ULLS LAST, r1."C 1" ASC NULLS LAST
(6 rows)

Your concern is that EXPLAIN shows ForeignScan node as "Foreign Scan"
although it actually works as "Foreign Join".

My suggestion makes EXPLAIN print "Foreign %s" according to the label
assigned by the extension. Only FDW driver knows how this ForeignScan
node actually performs on, thus, only FDW driver can set meaningful label.

This label may be "Join", may be "Partial Aggregate + Join", or something
others according to the actual job of this node.

> > - A flag to turn on/off printing relation(s) name
>
> ISTM that the relation information should be always printed even in the
> case of scanrelid=0 by the core, because that would be essential for
> analyzing EXPLAIN results.
>
We have no information which relations are actually scanned by ForeignScan
and CustomScan. Your patch abuses fs_relids and custom_relids, however,
these fields don't indicate the relations actually scan by these nodes.
It just tracks relations processed by this node or its children.

See the example below. This GpuJoin on CustomScan takes three SeqScan nodes
as inner/outer source of relations joins. GpuJoin never scans the tables
actually. It picks up the records generated by underlying SeqScan nodes.

postgres=# explain select id from t0 natural join t1 natural join t2;
QUERY PLAN
---------------------------------------------------------------------------
Custom Scan (GpuJoin) (cost=12385.67..291245.35 rows=9815465 width=4)
GPU Projection: t0.id
Depth 1: GpuHashJoin, HashKeys: (t0.bid)
JoinQuals: (t0.bid = t2.bid)
Nrows (in/out: 98.15%), KDS-Hash (size: 13.47MB, nbatches: 1)
Depth 2: GpuHashJoin, HashKeys: (t0.aid)
JoinQuals: (t0.aid = t1.aid)
Nrows (in/out: 100.00%), KDS-Hash (size: 13.47MB, nbatches: 1)
-> Seq Scan on t0 (cost=0.00..183333.96 rows=9999996 width=12)
-> Seq Scan on t2 (cost=0.00..1935.00 rows=100000 width=4)
-> Seq Scan on t1 (cost=0.00..1935.00 rows=100000 width=4)
(11 rows)

If EXPLAIN command attaches something like "on t0,t1,t2" after the GpuJoin
*with no choice*, it is problematic and misleading.
It shall be controllable by the extension which knows what tables are
actually scanned. (Please note that I never says extension makes the string.
It is a job of core explain. I suggest it needs to be controllable.)

Even though I recommended a boolean flag to turn on/off this print, it may
need to be a bitmap to indicate which underlying relations should be printed
by the explain command. Because we can easily assume ForeignScan/CustomScan
node that scans only a part of child tables. For example, it is available to
implement GpuJoin scans the t1 and t2 by itself but takes SeqScan on t0.

> > - A flag to turn on/off printing relation(s) name

So, my second suggestion shall be adjusted as follows:
- A bitmap to indicate which relation(s) name shall be printed by the core
explain command.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-07-28 01:22:17 Re: A Modest Upgrade Proposal
Previous Message Andres Freund 2016-07-27 23:26:41 Re: old_snapshot_threshold allows heap:toast disagreement