Re: Re: parallel distinct union and aggregate support patch

From: "bucoo(at)sohu(dot)com" <bucoo(at)sohu(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: tgl <tgl(at)sss(dot)pgh(dot)pa(dot)us>, dilipbalaut <dilipbalaut(at)gmail(dot)com>, thomas(dot)munro <thomas(dot)munro(at)gmail(dot)com>, tomas(dot)vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, hlinnaka <hlinnaka(at)iki(dot)fi>, robertmhaas <robertmhaas(at)gmail(dot)com>, pgsql <pgsql(at)j-davis(dot)com>
Subject: Re: Re: parallel distinct union and aggregate support patch
Date: 2021-01-25 14:14:40
Message-ID: 2021012522143928190161@sohu.com
Lists: pgsql-hackers

Now I have rewritten batch hashagg and batch sort, added some comments, and combined the two patches, based on master 2ad78a87f018260d4474eee63187e1cc73c9b976.
They now support rescan, and the GUCs enable_batch_hashagg/enable_batch_sort have been replaced by max_hashagg_batches/max_sort_batches; the default value is "0" (meaning disabled).
For grouping sets, each chain uses the "max_hashagg_batches" value; maybe we need a better algorithm for that.
Do not set "max_sort_batches" too large, because each tuplesort's work memory is "work_mem / max_sort_batches".
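For example, with the settings used in the tests below, each of the 21 tuplesorts would be limited to roughly work_mem / max_sort_batches (a back-of-the-envelope illustration, not output from the patch):

```sql
set work_mem = '100MB';
set max_sort_batches = 21;
-- each tuplesort then gets about 100MB / 21, i.e. roughly 4.8MB of work memory
```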

As a next step, after this patch is committed, I want to use batch sort to add parallel merge join (thanks, Dilip Kumar) and EXCEPT/INTERSECT support; discussion is welcome.
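Conceptually (this is my reading of the approach as reflected in the plans below, where Parallel BatchHashAggregate emits complete groups with no Finalize step; the names here are illustrative, not the patch's actual code), rows are partitioned by a hash of the grouping key so that every row of a group lands in the same batch, and each worker then aggregates whole batches independently, making a cross-worker combine step unnecessary:

```python
# Illustrative sketch of hash-partitioned batching for aggregation.
# Not the patch's code: function and variable names are invented.
from collections import defaultdict

def partition_into_batches(rows, key, num_batches):
    """Route each row to a batch by hashing its group key, so a
    given group appears in exactly one batch."""
    batches = defaultdict(list)
    for row in rows:
        batches[hash(key(row)) % num_batches].append(row)
    return batches

def aggregate_batch(batch, key):
    """sum(id) group by key, computed entirely within one batch."""
    sums = defaultdict(int)
    for row in batch:
        sums[key(row)] += row[0]
    return sums

# Toy data shaped like gtest: (id, txt) with txt as the group key.
rows = [(i, 'txt%d' % (i % 4)) for i in range(100)]
batches = partition_into_batches(rows, key=lambda r: r[1], num_batches=8)

# Because groups never straddle batches, merging per-batch results
# is a plain union -- no partial/finalize combine is needed.
result = {}
for b in batches.values():
    result.update(aggregate_batch(b, key=lambda r: r[1]))

assert sum(result.values()) == sum(r[0] for r in rows)
```

This also matches why the batch plans below show a single aggregation level under Gather, while the normal parallel plans need Partial plus Finalize nodes.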

Some test results:
hash group by: 17,974.797 ms -> 10,137.909 ms
sort group by: 117,475.380 ms -> 34,830.489 ms
grouping sets: 91,915.597 ms -> 24,585.103 ms
union: 95,765.297 ms -> 21,416.414 ms

---------------------------test details-------------------------------
Machine information:
Architecture: x86_64
CPU(s): 88
Thread(s) per core: 2
Core(s) per socket: 22
Socket(s): 2
NUMA node(s): 2
Model name: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz

prepare data:
begin;
create table gtest(id integer, txt text);
insert into gtest select t1.id,'txt'||t1.id from (select generate_series(1,10*1000*1000) id) t1,(select generate_series(1,10) id) t2;
analyze gtest;
commit;
set max_parallel_workers_per_gather=8;
set work_mem = '100MB';

hash aggregate:
explain (verbose,costs off,analyze)
select sum(id),txt from gtest group by txt;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Finalize HashAggregate (actual time=10832.805..17403.671 rows=10000000 loops=1)
Output: sum(id), txt
Group Key: gtest.txt
Batches: 29 Memory Usage: 102489kB Disk Usage: 404696kB
-> Gather (actual time=4389.345..7227.279 rows=10000058 loops=1)
Output: txt, (PARTIAL sum(id))
Workers Planned: 6
Workers Launched: 6
-> Partial HashAggregate (actual time=4353.147..5992.183 rows=1428580 loops=7)
Output: txt, PARTIAL sum(id)
Group Key: gtest.txt
Batches: 5 Memory Usage: 110641kB Disk Usage: 238424kB
Worker 0: actual time=4347.155..5954.088 rows=1398608 loops=1
Batches: 5 Memory Usage: 114737kB Disk Usage: 203928kB
Worker 1: actual time=4347.061..6209.121 rows=1443046 loops=1
Batches: 5 Memory Usage: 114737kB Disk Usage: 224384kB
Worker 2: actual time=4347.175..5882.065 rows=1408238 loops=1
Batches: 5 Memory Usage: 110641kB Disk Usage: 216360kB
Worker 3: actual time=4347.193..6015.830 rows=1477568 loops=1
Batches: 5 Memory Usage: 110641kB Disk Usage: 240824kB
Worker 4: actual time=4347.210..5950.730 rows=1404288 loops=1
Batches: 5 Memory Usage: 110641kB Disk Usage: 214872kB
Worker 5: actual time=4347.482..6064.460 rows=1439454 loops=1
Batches: 5 Memory Usage: 110641kB Disk Usage: 239400kB
-> Parallel Seq Scan on public.gtest (actual time=0.051..1216.378 rows=14285714 loops=7)
Output: id, txt
Worker 0: actual time=0.048..1219.133 rows=13986000 loops=1
Worker 1: actual time=0.047..1214.860 rows=14430370 loops=1
Worker 2: actual time=0.051..1222.124 rows=14082300 loops=1
Worker 3: actual time=0.061..1213.851 rows=14775580 loops=1
Worker 4: actual time=0.073..1216.712 rows=14042795 loops=1
Worker 5: actual time=0.049..1210.870 rows=14394480 loops=1
Planning Time: 0.673 ms
Execution Time: 17974.797 ms

batch hash aggregate:
set max_hashagg_batches = 100;
explain (verbose,costs off,analyze)
select sum(id),txt from gtest group by txt;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Gather (actual time=5050.110..9757.292 rows=10000000 loops=1)
Output: (sum(id)), txt
Workers Planned: 6
Workers Launched: 6
-> Parallel BatchHashAggregate (actual time=5032.178..7810.979 rows=1428571 loops=7)
Output: sum(id), txt
Group Key: gtest.txt
Worker 0: actual time=5016.488..7694.715 rows=1399958 loops=1
Worker 1: actual time=5021.651..7942.628 rows=1501753 loops=1
Worker 2: actual time=5018.327..7944.842 rows=1400176 loops=1
Worker 3: actual time=5082.977..7973.635 rows=1400818 loops=1
Worker 4: actual time=5019.229..7847.522 rows=1499952 loops=1
Worker 5: actual time=5017.086..7667.116 rows=1398470 loops=1
-> Parallel Seq Scan on public.gtest (actual time=0.055..1378.237 rows=14285714 loops=7)
Output: id, txt
Worker 0: actual time=0.057..1349.870 rows=14533515 loops=1
Worker 1: actual time=0.052..1376.305 rows=13847620 loops=1
Worker 2: actual time=0.068..1382.226 rows=13836705 loops=1
Worker 3: actual time=0.071..1405.669 rows=13856130 loops=1
Worker 4: actual time=0.055..1406.186 rows=14677345 loops=1
Worker 5: actual time=0.045..1351.142 rows=15344825 loops=1
Planning Time: 0.250 ms
Execution Time: 10137.909 ms

sort aggregate:
set enable_hashagg = off;
set max_hashagg_batches = 0;
explain (verbose,costs off,analyze)
select sum(id),txt from gtest group by txt;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate (actual time=10370.559..116494.922 rows=10000000 loops=1)
Output: sum(id), txt
Group Key: gtest.txt
-> Gather Merge (actual time=10370.487..112470.148 rows=10000059 loops=1)
Output: txt, (PARTIAL sum(id))
Workers Planned: 6
Workers Launched: 6
-> Partial GroupAggregate (actual time=8608.563..24526.716 rows=1428580 loops=7)
Output: txt, PARTIAL sum(id)
Group Key: gtest.txt
Worker 0: actual time=8283.755..18641.475 rows=887626 loops=1
Worker 1: actual time=8303.984..26206.673 rows=1536832 loops=1
Worker 2: actual time=8290.611..28110.145 rows=1676544 loops=1
Worker 3: actual time=10347.326..29912.135 rows=1783536 loops=1
Worker 4: actual time=8329.604..20262.795 rows=980352 loops=1
Worker 5: actual time=8322.877..27957.446 rows=1758958 loops=1
-> Sort (actual time=8608.501..21752.009 rows=14285714 loops=7)
Output: txt, id
Sort Key: gtest.txt
Sort Method: external merge Disk: 349760kB
Worker 0: actual time=8283.648..16831.068 rows=8876115 loops=1
Sort Method: external merge Disk: 225832kB
Worker 1: actual time=8303.927..23053.078 rows=15368320 loops=1
Sort Method: external merge Disk: 391008kB
Worker 2: actual time=8290.556..24735.395 rows=16765440 loops=1
Sort Method: external merge Disk: 426552kB
Worker 3: actual time=10347.264..26438.333 rows=17835210 loops=1
Sort Method: external merge Disk: 453768kB
Worker 4: actual time=8329.534..18248.302 rows=9803520 loops=1
Sort Method: external merge Disk: 249408kB
Worker 5: actual time=8322.827..24480.383 rows=17589430 loops=1
Sort Method: external merge Disk: 447520kB
-> Parallel Seq Scan on public.gtest (actual time=51.618..1530.850 rows=14285714 loops=7)
Output: txt, id
Worker 0: actual time=49.907..1001.606 rows=8876115 loops=1
Worker 1: actual time=51.011..1665.980 rows=15368320 loops=1
Worker 2: actual time=50.087..1812.426 rows=16765440 loops=1
Worker 3: actual time=51.010..1828.299 rows=17835210 loops=1
Worker 4: actual time=42.614..1077.896 rows=9803520 loops=1
Worker 5: actual time=51.010..1790.012 rows=17589430 loops=1
Planning Time: 0.119 ms
Execution Time: 117475.380 ms

batch sort aggregate:
set max_sort_batches = 21;
explain (verbose,costs off,analyze)
select sum(id),txt from gtest group by txt;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Gather (actual time=18699.622..34438.083 rows=10000000 loops=1)
Output: (sum(id)), txt
Workers Planned: 6
Workers Launched: 6
-> GroupAggregate (actual time=18671.875..31121.607 rows=1428571 loops=7)
Output: sum(id), txt
Group Key: gtest.txt
Worker 0: actual time=18669.038..30913.680 rows=1427622 loops=1
Worker 1: actual time=18674.356..31045.516 rows=1430448 loops=1
Worker 2: actual time=18677.565..31375.340 rows=1427636 loops=1
Worker 3: actual time=18667.879..31359.458 rows=1427935 loops=1
Worker 4: actual time=18669.760..31263.414 rows=1430220 loops=1
Worker 5: actual time=18645.428..30813.141 rows=1427411 loops=1
-> Parallel BatchSort (actual time=18671.796..29348.606 rows=14285714 loops=7)
Output: txt, id
Sort Key: gtest.txt
batches: 21
Worker 0: actual time=18668.856..29172.519 rows=14276220 loops=1
Worker 1: actual time=18674.287..29280.794 rows=14304480 loops=1
Worker 2: actual time=18677.501..29569.974 rows=14276360 loops=1
Worker 3: actual time=18667.801..29558.286 rows=14279350 loops=1
Worker 4: actual time=18669.689..29468.636 rows=14302200 loops=1
Worker 5: actual time=18645.367..29076.665 rows=14274110 loops=1
-> Parallel Seq Scan on public.gtest (actual time=50.164..1893.727 rows=14285714 loops=7)
Output: txt, id
Worker 0: actual time=50.058..1818.959 rows=13953440 loops=1
Worker 1: actual time=50.974..1723.268 rows=13066735 loops=1
Worker 2: actual time=48.050..1855.469 rows=13985175 loops=1
Worker 3: actual time=49.640..1791.897 rows=12673240 loops=1
Worker 4: actual time=48.027..1932.927 rows=14586880 loops=1
Worker 5: actual time=51.151..2094.981 rows=16360290 loops=1
Planning Time: 0.160 ms
Execution Time: 34830.489 ms

normal grouping sets:
set enable_hashagg = on;
set max_sort_batches = 0;
set max_hashagg_batches = 0;
explain (costs off,verbose,analyze)
select sum(id),txt from gtest group by grouping sets(id,txt,());
QUERY PLAN
----------------------------------------------------------------------------------------------------------
MixedAggregate (actual time=4563.123..90348.608 rows=20000001 loops=1)
Output: sum(id), txt, id
Hash Key: gtest.txt
Group Key: gtest.id
Group Key: ()
Batches: 29 Memory Usage: 114737kB Disk Usage: 3241968kB
-> Gather Merge (actual time=4563.070..39429.593 rows=100000000 loops=1)
Output: txt, id
Workers Planned: 6
Workers Launched: 6
-> Sort (actual time=4493.638..7532.910 rows=14285714 loops=7)
Output: txt, id
Sort Key: gtest.id
Sort Method: external merge Disk: 353080kB
Worker 0: actual time=4474.665..7853.595 rows=14327510 loops=1
Sort Method: external merge Disk: 364528kB
Worker 1: actual time=4492.273..7796.141 rows=14613250 loops=1
Sort Method: external merge Disk: 371776kB
Worker 2: actual time=4472.937..7626.318 rows=14339905 loops=1
Sort Method: external merge Disk: 364840kB
Worker 3: actual time=4480.141..7730.419 rows=14406135 loops=1
Sort Method: external merge Disk: 366528kB
Worker 4: actual time=4490.723..7581.102 rows=13971200 loops=1
Sort Method: external merge Disk: 355096kB
Worker 5: actual time=4482.204..7894.434 rows=14464410 loops=1
Sort Method: external merge Disk: 368008kB
-> Parallel Seq Scan on public.gtest (actual time=27.040..1514.516 rows=14285714 loops=7)
Output: txt, id
Worker 0: actual time=23.111..1514.219 rows=14327510 loops=1
Worker 1: actual time=22.696..1528.771 rows=14613250 loops=1
Worker 2: actual time=23.119..1519.190 rows=14339905 loops=1
Worker 3: actual time=22.705..1525.183 rows=14406135 loops=1
Worker 4: actual time=23.134..1509.694 rows=13971200 loops=1
Worker 5: actual time=23.652..1516.585 rows=14464410 loops=1
Planning Time: 0.162 ms
Execution Time: 91915.597 ms

batch grouping sets:
set max_hashagg_batches = 100;
explain (costs off,verbose,analyze)
select sum(id),txt from gtest group by grouping sets(id,txt,());
QUERY PLAN
---------------------------------------------------------------------------------------------------
Gather (actual time=9082.581..23203.803 rows=20000001 loops=1)
Output: (sum(id)), txt, id
Workers Planned: 6
Workers Launched: 6
-> Parallel BatchHashAggregate (actual time=9040.895..15911.190 rows=2857143 loops=7)
Output: sum(id), txt, id
Group Key: gtest.id
Group Key: ()
Group Key: gtest.txt
Worker 0: actual time=9031.714..15499.292 rows=3101124 loops=1
Worker 1: actual time=9038.217..15403.655 rows=3100997 loops=1
Worker 2: actual time=9030.557..15157.267 rows=3103320 loops=1
Worker 3: actual time=9034.391..15537.851 rows=3100505 loops=1
Worker 4: actual time=9037.079..19823.359 rows=1400191 loops=1
Worker 5: actual time=9032.359..15012.338 rows=3097137 loops=1
-> Parallel Seq Scan on public.gtest (actual time=0.052..1506.109 rows=14285714 loops=7)
Output: id, txt
Worker 0: actual time=0.058..1521.705 rows=13759375 loops=1
Worker 1: actual time=0.054..1514.218 rows=13758635 loops=1
Worker 2: actual time=0.062..1531.244 rows=14456270 loops=1
Worker 3: actual time=0.050..1506.569 rows=14451930 loops=1
Worker 4: actual time=0.053..1495.908 rows=15411240 loops=1
Worker 5: actual time=0.055..1503.382 rows=14988885 loops=1
Planning Time: 0.160 ms
Execution Time: 24585.103 ms

normal union:
set max_hashagg_batches = 0;
set max_sort_batches = 0;
explain (verbose,costs false,analyze)
select * from gtest union select * from gtest;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Unique (actual time=53939.294..94666.573 rows=10000000 loops=1)
Output: gtest.id, gtest.txt
-> Sort (actual time=53939.292..76581.157 rows=200000000 loops=1)
Output: gtest.id, gtest.txt
Sort Key: gtest.id, gtest.txt
Sort Method: external merge Disk: 4871024kB
-> Append (actual time=0.020..25832.476 rows=200000000 loops=1)
-> Seq Scan on public.gtest (actual time=0.019..7074.113 rows=100000000 loops=1)
Output: gtest.id, gtest.txt
-> Seq Scan on public.gtest gtest_1 (actual time=0.006..7067.898 rows=100000000 loops=1)
Output: gtest_1.id, gtest_1.txt
Planning Time: 0.152 ms
Execution Time: 95765.297 ms

batch hash aggregate union:
set max_hashagg_batches = 100;
explain (verbose,costs false,analyze)
select * from gtest union select * from gtest;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Gather (actual time=11623.986..21021.317 rows=10000000 loops=1)
Output: gtest.id, gtest.txt
Workers Planned: 6
Workers Launched: 6
-> Parallel BatchHashAggregate (actual time=11636.753..16584.067 rows=1428571 loops=7)
Output: gtest.id, gtest.txt
Group Key: gtest.id, gtest.txt
Worker 0: actual time=11631.225..16846.376 rows=1500587 loops=1
Worker 1: actual time=11553.019..16233.006 rows=1397874 loops=1
Worker 2: actual time=11581.523..16807.962 rows=1499049 loops=1
Worker 3: actual time=11593.865..16416.381 rows=1399579 loops=1
Worker 4: actual time=11772.115..16783.605 rows=1400961 loops=1
Worker 5: actual time=11702.415..16571.841 rows=1400943 loops=1
-> Parallel Append (actual time=0.047..4339.450 rows=28571429 loops=7)
Worker 0: actual time=0.062..4396.130 rows=28591565 loops=1
Worker 1: actual time=0.053..4383.983 rows=29536360 loops=1
Worker 2: actual time=0.045..4305.253 rows=28282900 loops=1
Worker 3: actual time=0.053..4295.805 rows=28409625 loops=1
Worker 4: actual time=0.061..4314.450 rows=28363645 loops=1
Worker 5: actual time=0.015..4311.121 rows=29163585 loops=1
-> Parallel Seq Scan on public.gtest (actual time=0.030..1201.563 rows=14285714 loops=7)
Output: gtest.id, gtest.txt
Worker 0: actual time=0.019..281.903 rows=3277090 loops=1
Worker 1: actual time=0.050..2473.135 rows=29536360 loops=1
Worker 2: actual time=0.021..273.766 rows=3252955 loops=1
Worker 3: actual time=0.018..285.911 rows=3185145 loops=1
Worker 4: actual time=0.058..2387.626 rows=28363645 loops=1
Worker 5: actual time=0.013..2432.342 rows=29163585 loops=1
-> Parallel Seq Scan on public.gtest gtest_1 (actual time=0.048..2140.373 rows=25000000 loops=4)
Output: gtest_1.id, gtest_1.txt
Worker 0: actual time=0.059..2173.690 rows=25314475 loops=1
Worker 2: actual time=0.043..2114.314 rows=25029945 loops=1
Worker 3: actual time=0.050..2142.670 rows=25224480 loops=1
Planning Time: 0.137 ms
Execution Time: 21416.414 ms

bucoo(at)sohu(dot)com

Attachment Content-Type Size
support-parallel-union-distinct-aggregate-using-batc.patch application/octet-stream 145.4 KB
