Re: Getting sorted data from foreign server for merge join

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Getting sorted data from foreign server for merge join
Date: 2015-12-17 08:32:17
Message-ID: CAFjFpRc_AO4z+V3SfsMp2gmEMaTUSXh0EjEWktTSqcBtSC5f6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
> wrote:
> > Thanks Ashutosh.
> >
> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> > looks good to me.
>
> This patch needs a rebase.
>

Done.

>
> It's not going to work to say this is a patch proposed for commit when
> it's still got a TODO comment in it that obviously needs to be
> changed. And the formatting of that long comment is pretty weird,
> too, and not consistent with other functions in that same file (e.g.
> get_remote_estimate, ec_member_matches_foreign, create_cursor).
>
>
The TODO was present in v4 but not in v5 and is not present in v6 attached
here.. Formatted comment according estimate_path_cost_size(),
convert_prep_stmt_params().

> Aside from that, I think before we commit this, somebody should do
> some testing that demonstrates that this is actually a good idea. Not
> as part of the test case set for this patch, but just in general.
> Merge joins are typically going to be relevant for large tables, but
> the examples in the regression tests are necessarily tiny. I'd like
> to see some sample data and some sample queries that get appreciably
> faster with this code. If we can't find any, we don't need the code.
>
>
I tested the patch on my laptop with two types of queries, a join between
two foreign tables on different foreign servers (pointing to the same self
server) and a join between one foreign and one local table. The foreign
tables and servers are created using sort_pd_setup.sql attached. Foreign
tables pointed to table with index useful for join clause. Both the joining
tables had 10M rows. The execution time of query was measured for 100 runs
and average and standard deviation were calculated (using function
query_execution_stats() in script sort_pd.sql) and are presented below.

1. Query between foreign tables
SELECT ft1.val, ft2.val FROM ft1 join ft2 on (ft1.val = ft2.val)

Plan and timings without patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=508510.02..1129945.94 rows=999995 width=8) (actual
time=33803.826..82416.342 rows=10000000 loops=1)
Output: ft1.val, ft2.val
Hash Cond: (ft1.val = ft2.val)
-> Foreign Scan on public.ft1 (cost=100.00..344347.31 rows=9999977
width=4) (actual time=0.624..28531.803 rows=10000000 loops=1)
Output: ft1.val
Remote SQL: SELECT val FROM public.lt
-> Hash (cost=344347.31..344347.31 rows=9999977 width=4) (actual
time=33258.025..33258.025 rows=10000000 loops=1)
Output: ft2.val
Buckets: 131072 Batches: 256 Memory Usage: 2400kB
-> Foreign Scan on public.ft2 (cost=100.00..344347.31
rows=9999977 width=4) (actual time=22.171..28134.970 rows=10000000 loops=1)
Output: ft2.val
Remote SQL: SELECT val FROM public.lt
Planning time: 33.155 ms
Execution time: 82914.607 ms
(14 rows)

avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
78750.95487 | 2911.51825687913 | 74314.886 | 89358.464

Plan and timing with patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=200.86..1183070.86 rows=10000000 width=8) (actual
time=1.776..73140.219 rows=10000000 loops=1)
Output: ft1.val, ft2.val
Merge Cond: (ft1.val = ft2.val)
-> Foreign Scan on public.ft1 (cost=100.43..504035.43 rows=10000000
width=4) (actual time=0.937..30422.457 rows=10000000 loops=1)
Output: ft1.val, ft1.val2
Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
-> Materialize (cost=100.43..529035.43 rows=10000000 width=4) (actual
time=0.826..33448.822 rows=10000000 loops=1)
Output: ft2.val, ft2.val2
-> Foreign Scan on public.ft2 (cost=100.43..504035.43
rows=10000000 width=4) (actual time=0.818..31035.362 rows=10000000 loops=1)
Output: ft2.val, ft2.val2
Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
Planning time: 163.161 ms
Execution time: 73654.106 ms
(13 rows)

avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
71881.15916 | 819.091605498189 | 70197.312 | 74653.314

It can be observed that the with the patch, merge join strategy is used
instead of hash join and the execution time reduces by approx 9%. A desired
effect is that the deviation in the execution time has reduced heavily
(almost by 75%).

2. Join between local and foreign table

Without patch the plan and timings are
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=308410.66..1019846.69 rows=9999970 width=8) (actual
time=7674.681..47767.136 rows=10000000 loops=1)
Output: lt.val, ft1.val
Hash Cond: (ft1.val = lt.val)
-> Foreign Scan on public.ft1 (cost=100.00..344347.55 rows=9999985
width=4) (actual time=0.506..26679.980 rows=10000000 loops=1)
Output: ft1.val
Remote SQL: SELECT val FROM public.lt
-> Hash (cost=144247.85..144247.85 rows=9999985 width=4) (actual
time=7667.598..7667.598 rows=10000000 loops=1)
Output: lt.val
Buckets: 131072 Batches: 256 Memory Usage: 2400kB
-> Seq Scan on public.lt (cost=0.00..144247.85 rows=9999985
width=4) (actual time=0.018..2959.111 rows=10000000 loops=1)
Output: lt.val
Planning time: 8.668 ms
Execution time: 48209.365 ms
(13 rows)

SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
FROM query_execution_stats(:'query', :num_samples);
avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
47246.46956 | 2579.42041949119 | 43603.411 | 56096.759

With the patch the plan and timings are
EXPLAIN (VERBOSE, ANALYSE) :query ;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=155.01..957924.85 rows=9999970 width=8) (actual
time=0.592..45125.356 rows=10000000 loops=1)
Output: lt.val, ft1.val
Merge Cond: (ft1.val = lt.val)
-> Foreign Scan on public.ft1 (cost=100.43..504038.91 rows=9999985
width=4) (actual time=0.551..30526.048 rows=10000000 loops=1)
Output: ft1.val, ft1.val2
Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
-> Index Only Scan using i_lt_val on public.lt (cost=0.43..303939.21
rows=9999985 width=4) (actual time=0.032..6192.406 rows=10000000 loops=1)
Output: lt.val
Heap Fetches: 10000000
Planning time: 9.043 ms
Execution time: 45666.023 ms
(11 rows)

avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
42803.36105 | 166.874491432755 | 42321.314 | 43316.902

Again observe that with the patch, merge join is used instead of hash join
and timing reduces by approx 9%. Again the deviation in execution reduces
heavily (almost by 75%). There is increase in planning time with the patch
owing to firing EXPLAIN on the foreign server.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
sort_pd_setup.sql text/x-sql 2.0 KB
pg_sort_all_pd_v6.patch binary/octet-stream 26.2 KB
sort_pd.sql text/x-sql 252 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2015-12-17 09:02:45 Re: extend pgbench expressions with functions
Previous Message Kenan Yao 2015-12-17 08:21:30 A question regarding LWLock in ProcSleep