Possible regression with gather merge.

From: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Possible regression with gather merge.
Date: 2017-03-22 05:37:36
Message-ID: CAD__Ouj65DF8yv0LRu-+=EK8kHp+1==byxa3Y8HEOCUG=hEyCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I accidently encountered a case where gather merge was picked as
default but disabling same by setting max_parallel_workers_per_gather
= 0; produced a non-parallel plan which was faster than gather merge,
but its cost is marked too high when compared to gather merge.

I guess we need some cost adjustment is planner code.

Test setting
=========
create table test as (select id, (random()*10000)::int as v1, random() as
v2 from generate_series(1,1000000) id);
create index test_v1_idx on test (v1);

Server setting is default.

postgres=# explain analyze select * from test order by v1, v2 limit 10;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=19576.71..19577.88 rows=10 width=16) (actual
time=265.989..265.995 rows=10 loops=1)
-> Gather Merge (cost=19576.71..116805.80 rows=833334 width=16)
(actual time=265.987..265.992 rows=10 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=18576.69..19618.36 rows=416667 width=16)
(actual time=250.202..250.424 rows=911 loops=3)
Sort Key: v1, v2
Sort Method: external merge Disk: 9272kB
-> Parallel Seq Scan on test (cost=0.00..9572.67
rows=416667 width=16) (actual time=0.053..41.397 rows=333333 loops=3)
Planning time: 0.193 ms
Execution time: 271.222 ms

postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=37015.64..37015.67 rows=10 width=16) (actual
time=211.582..211.584 rows=10 loops=1)
-> Sort (cost=37015.64..39515.64 rows=1000000 width=16) (actual
time=211.581..211.582 rows=10 loops=1)
Sort Key: v1, v2
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on test (cost=0.00..15406.00 rows=1000000
width=16) (actual time=0.085..107.522 rows=1000000 loops=1)
Planning time: 0.093 ms
Execution time: 211.608 ms
(7 rows)

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-03-22 05:50:34 Re: Potential data loss of 2PC files
Previous Message Michael Paquier 2017-03-22 05:31:38 Re: exposing wait events for non-backends (was: Tracking wait event for latches)