Re: Join push-down support for foreign tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Join push-down support for foreign tables
Date: 2015-03-04 10:17:02
Message-ID: CAFjFpRcs3FLvtqbaRqL6r3cksHHhXAXPZTkmjqJx4=oyfbBp9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Hanada-san,
I am looking at the patch. Here are my comments

In create_foreignscan_path() we have lines like -
1587 pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
1588
required_outer);
Now, that the same function is being used for creating foreign scan paths
for joins, we should be calling get_joinrel_parampathinfo() on a join rel
and get_baserel_parampathinfo() on base rel.

The patch seems to handle all the restriction clauses in the same way.
There are two kinds of restriction clauses - a. join quals (specified using
ON clause; optimizer might move them to the other class if that doesn't
affect correctness) and b. quals on join relation (specified in the WHERE
clause, optimizer might move them to the other class if that doesn't affect
correctness). The quals in "a" are applied while the join is being computed
whereas those in "b" are applied after the join is computed. For example,
postgres=# select * from lt;
val | val2
-----+------
1 | 2
1 | 3
(2 rows)

postgres=# select * from lt2;
val | val2
-----+------
1 | 2
(1 row)

postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2);
val | val2 | val | val2
-----+------+-----+------
1 | 2 | 1 | 2
1 | 3 | |
(2 rows)

postgres=# select * from lt left join lt2 on (true) where (lt.val2 =
lt2.val2);
val | val2 | val | val2
-----+------+-----+------
1 | 2 | 1 | 2
(1 row)

The difference between these two kinds is evident in case of outer joins,
for inner join optimizer puts all of them in class "b". The remote query
sent to the foreign server has all those in ON clause. Consider foreign
tables ft1 and ft2 pointing to local tables on the same server.
postgres=# \d ft1
Foreign table "public.ft1"
Column | Type | Modifiers | FDW Options
--------+---------+-----------+-------------
val | integer | |
val2 | integer | |
Server: loopback
FDW Options: (table_name 'lt')

postgres=# \d ft2
Foreign table "public.ft2"
Column | Type | Modifiers | FDW Options
--------+---------+-----------+-------------
val | integer | |
val2 | integer | |
Server: loopback
FDW Options: (table_name 'lt2')

postgres=# explain verbose select * from ft1 left join ft2 on (ft1.val2 =
ft2.val2) where ft1.val + ft2.val > ft1.val2 or ft2.val is null;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------
Foreign Scan (cost=100.00..125.60 rows=2560 width=16)
Output: val, val2, val, val2
Remote SQL: SELECT r.a_0, r.a_1, l.a_0, l.a_1 FROM (SELECT val, val2
FROM public.lt2) l (a_0, a_1) RIGHT JOIN (SELECT val, val2 FROM public.lt)
r (a
_0, a_1) ON ((((r.a_0 + l.a_0) > r.a_1) OR (l.a_0 IS NULL))) AND ((r.a_1 =
l.a_1))
(3 rows)

The result is then wrong
postgres=# select * from ft1 left join ft2 on (ft1.val2 = ft2.val2) where
ft1.val + ft2.val > ft1.val2 or ft2.val is null;
val | val2 | val | val2
-----+------+-----+------
1 | 2 | |
1 | 3 | |
(2 rows)

which should match the result obtained by substituting local tables for
foreign ones
postgres=# select * from lt left join lt2 on (lt.val2 = lt2.val2) where
lt.val + lt2.val > lt.val2 or lt2.val is null;
val | val2 | val | val2
-----+------+-----+------
1 | 3 | |
(1 row)

Once we start distinguishing the two kinds of quals, there is some
optimization possible. For pushing down a join it's essential that all the
quals in "a" are safe to be pushed down. But a join can be pushed down,
even if quals in "a" are not safe to be pushed down. But more clauses one
pushed down to foreign server, lesser are the rows fetched from the foreign
server. In postgresGetForeignJoinPath, instead of checking all the
restriction clauses to be safe to be pushed down, we need to check only
those which are join quals (class "a").

Following EXPLAIN output seems to be confusing
ft1 and ft2 both are pointing to same lt on a foreign server.
postgres=# explain verbose select ft1.val + ft1.val2 from ft1, ft2 where
ft1.val + ft1.val2 = ft2.val;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------
Foreign Scan (cost=100.00..132.00 rows=2560 width=8)
Output: (val + val2)
Remote SQL: SELECT r.a_0, r.a_1 FROM (SELECT val, NULL FROM public.lt) l
(a_0, a_1) INNER JOIN (SELECT val, val2 FROM public.lt) r (a_0, a_1) ON ((
(r.a_0 + r.a_1) = l.a_0))

Output just specified val + val2, it doesn't tell, where those val and val2
come from, neither it's evident from the rest of the context.

On Mon, Mar 2, 2015 at 6:18 PM, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
wrote:

> Attached is the revised/rebased version of the $SUBJECT.
>
> This patch is based on Kaigai-san's custom/foreign join patch, so
> please apply it before this patch. In this version I changed some
> points from original postgres_fdw.
>
> 1) Disabled SELECT clause optimization
> ~9.4 postgres_fdw lists only columns actually used in SELECT clause,
> but AFAIS it makes SQL generation complex. So I disabled such
> optimization and put "NULL" for unnecessary columns in SELECT clause
> of remote query.
>
> 2) Extended deparse context
> To allow deparsing based on multiple source relations, I added some
> members to context structure. They are unnecessary for simple query
> with single foreign table, but IMO it should be integrated.
>
> With Kaigai-san's advise, changes for supporting foreign join on
> postgres_fdw is minimized into postgres_fdw itself. But I added new
> FDW API named GetForeignJoinPaths() to keep the policy that all
> interface between core and FDW should be in FdwRoutine, instead of
> using hook function. Now I'm writing document about it, and will post
> it in a day.
>
> 2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
> > 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>:
> >> Let me put some comments in addition to where you're checking now.
> >>
> >> [design issues]
> >> * Cost estimation
> >> Estimation and evaluation of cost for remote join query is not an
> >> obvious issue. In principle, local side cannot determine the cost
> >> to run remote join without remote EXPLAIN, because local side has
> >> no information about JOIN logic applied on the remote side.
> >> Probably, we have to put an assumption for remote join algorithm,
> >> because local planner has no idea about remote planner's choice
> >> unless foreign-join don't take "use_remote_estimate".
> >> I think, it is reasonable assumption (even if it is incorrect) to
> >> calculate remote join cost based on local hash-join algorithm.
> >> If user wants more correct estimation, remote EXPLAIN will make
> >> more reliable cost estimation.
> >
> > Hm, I guess that you chose hash-join as "least-costed join". In the
> > pgbench model, most combination between two tables generate hash join
> > as cheapest path. Remote EXPLAIN is very expensive in the context of
> > planning, so it would easily make the plan optimization meaningless.
> > But giving an option to users is good, I agree.
> >
> >>
> >> It also needs a consensus whether cost for remote CPU execution is
> >> equivalent to local CPU. If we think local CPU is rare resource
> >> than remote one, a discount rate will make planner more preferable
> >> to choose remote join than local one
> >
> > Something like cpu_cost_ratio as a new server-level FDW option?
> >
> >>
> >> Once we assume a join algorithm for remote join, unit cost for
> >> remote CPU, we can calculate a cost for foreign join based on
> >> the local join logic plus cost for network translation (maybe
> >> fdw_tuple_cost?).
> >
> > Yes, sum of these costs is the total cost of a remote join.
> > o fdw_startup_cost
> > o hash-join cost, estimated as a local join
> > o fdw_tuple_cost * rows * width
> >
> >> * FDW options
> >> Unlike table scan, FDW options we should refer is unclear.
> >> Table level FDW options are associated with a foreign table as
> >> literal. I think we have two options here:
> >> 1. Foreign-join refers FDW options for foreign-server, but ones
> >> for foreign-tables are ignored.
> >> 2. Foreign-join is prohibited when both of relations don't have
> >> identical FDW options.
> >> My preference is 2. Even though N-way foreign join, it ensures
> >> all the tables involved with (N-1)-way foreign join has identical
> >> FDW options, thus it leads we can make N-way foreign join with
> >> all identical FDW options.
> >> One exception is "updatable" flag of postgres_fdw. It does not
> >> make sense on remote join, so I think mixture of updatable and
> >> non-updatable foreign tables should be admitted, however, it is
> >> a decision by FDW driver.
> >>
> >> Probably, above points need to take time for getting consensus.
> >> I'd like to see your opinion prior to editing your patch.
> >
> > postgres_fdw can't push down a join which contains foreign tables on
> > multiple servers, so use_remote_estimate and fdw_startup_cost are the
> > only FDW options to consider. So we have options for each option.
> >
> > 1-a. If all foreign tables in the join has identical
> > use_remote_estimate, allow pushing down.
> > 1-b. If any of foreign table in the join has true as
> > use_remote_estimate, use remote estimate.
> >
> > 2-a. If all foreign tables in the join has identical fdw_startup_cost,
> > allow pushing down.
> > 2-b. Always use max value in the join. (cost would be more expensive)
> > 2-c. Always use min value in the join. (cost would be cheaper)
> >
> > I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
> > reasonable cost about startup.
> >
> > I agree about "updatable" option.
> >
> >>
> >> [implementation issues]
> >> The interface does not intend to add new Path/Plan type for each scan
> >> that replaces foreign joins. What postgres_fdw should do is, adding
> >> ForeignPath towards a particular joinrel, then it populates ForeignScan
> >> with remote join query once it got chosen by the planner.
> >
> > That idea is interesting, and make many things simpler. Please let me
> consider.
> >
> >>
> >> A few functions added in src/backend/foreign/foreign.c are not
> >> called by anywhere, at this moment.
> >>
> >> create_plan_recurse() is reverted to static. It is needed for custom-
> >> join enhancement, if no other infrastructure can support.
> >
> > I made it back to static because I thought that create_plan_recurse
> > can be called by core before giving control to FDWs. But I'm not sure
> > it can be applied to custom scans. I'll recheck that part.
> >
> >
> > --
> > Shigeru HANADA
>
>
>
> --
> Shigeru HANADA
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vladimir Borodin 2015-03-04 10:53:47 Re: pg_upgrade and rsync
Previous Message Jeevan Chalke 2015-03-04 09:43:40 Re: How about to have relnamespace and relrole?