Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2014-12-18 15:52:26
Message-ID: CAA4eK1+O2fZe88NpxmMne94bwBKUEcx-2yuAe-MTYpbuoJ0NCw@mail.gmail.com

On Mon, Dec 8, 2014 at 10:40 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> On Sat, Dec 6, 2014 at 5:37 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >
>
> So to summarize my understanding, below are the set of things
> which I should work on and in the order they are listed.
>
> 1. Push down qualification
> 2. Performance Data
> 3. Improve the way to push down the information related to worker.
> 4. Dynamic allocation of work for workers.
>
>

I have worked on the patch to accomplish the above mentioned points
1, 2 and partly 3, and would like to share the progress with the community.
If the statement contains quals that don't have volatile functions, then
they will be pushed down and the parallel scan will be considered for
cost evaluation. I think eventually we might need some better way
to decide which kinds of functions are okay to push down.
I have also unified the way information is passed from the master backend
to worker backends: each node that has to be passed is converted to a
string, and the workers later convert the string back into a node. This
has simplified the related code.
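
To illustrate (this is only a rough sketch and not the patch code; the
helper names serialize_qual_for_worker/restore_qual_in_worker are made up
for this mail), these two pieces can be expressed with existing backend
facilities roughly like this:

/*
 * Sketch only: decide pushdown with contain_volatile_functions() and
 * ship the qual to a worker via nodeToString()/stringToNode().
 */
#include "postgres.h"
#include "nodes/nodes.h"
#include "optimizer/clauses.h"

/* master backend: hand the qual to workers only if it is volatile-free */
static char *
serialize_qual_for_worker(Node *qual)
{
    if (qual == NULL || contain_volatile_functions(qual))
        return NULL;            /* keep evaluating it in the master */

    return nodeToString(qual);  /* flat string representation */
}

/* worker backend: rebuild the node tree from the received string */
static Node *
restore_qual_in_worker(char *qual_string)
{
    if (qual_string == NULL)
        return NULL;

    return (Node *) stringToNode(qual_string);
}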

I have taken performance data for different selectivities and complexities
of qual expressions. I understand that there will be other kinds of
scenarios which we need to consider; however, I think the current set of
tests is a good place to start. Please feel free to comment on the kinds
of scenarios you want me to check.

Performance Data
------------------------------
*m/c details*
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB
*non-default settings in postgresql.conf*
max_connections=300
shared_buffers = 8GB
checkpoint_segments = 300
checkpoint_timeout = 30min
max_worker_processes=100

create table tbl_perf(c1 int, c2 char(1000));

30 million rows
------------------------
insert into tbl_perf values(generate_series(1,10000000),'aaaaa');
insert into tbl_perf values(generate_series(10000000,30000000),'aaaaa');

Function used in quals
-----------------------------------
A simple function which performs a throwaway calculation (a factorial) to
make the qual expensive to evaluate and returns the value passed to it, so
that it can be used in a qual condition. It is marked STABLE, so the qual
containing it is still eligible for pushdown.

create or replace function calc_factorial(a integer, fact_val integer)
returns integer
as $$
begin
    -- compute (and discard) a factorial just to burn CPU on each call
    perform (fact_val)!;
    return a;
end;
$$ language plpgsql STABLE;

In the data below,
num_workers - number of parallel workers configured using
parallel_seqscan_degree. 0 means a plain sequential scan is executed;
greater than 0 means a parallel sequential scan.

exec_time - execution time reported by the EXPLAIN ANALYZE statement.

*Tests having quals containing function evaluation in qual expressions.*

*Test-1*
*Query -* Explain analyze select c1 from tbl_perf where
c1 > calc_factorial(29700000,10) and c2 like '%aa%';
*Selection_criteria –* 1% of rows will be selected

num_workers    exec_time (ms)
0              229534
2              121741
4              67051
8              35607
16             24743

*Test-2*
*Query - *Explain analyze select c1 from tbl_perf where
c1 > calc_factorial(27000000,10) and c2 like '%aa%';
*Selection_criteria –* 10% of rows will be selected

num_workers    exec_time (ms)
0              226671
2              151587
4              93648
8              70540
16             55466

*Test-3*
*Query -* Explain analyze select c1 from tbl_perf
where c1 > calc_factorial(22500000,10) and c2 like '%aa%';
*Selection_criteria –* 25% of rows will be selected

num_workers    exec_time (ms)
0              232673
2              197609
4              142686
8              111664
16             98097

*Tests having quals containing simple expressions.*

*Test-4*
*Query - *Explain analyze select c1 from tbl_perf
where c1 > 29700000 and c2 like '%aa%';
*Selection_criteria –* 1% of rows will be selected

num_workers    exec_time (ms)
0              15505
2              9155
4              6030
8              4523
16             4459
32             8259
64             13388

*Test-5*
*Query - *Explain analyze select c1 from tbl_perf
where c1 > 28500000 and c2 like '%aa%';
*Selection_criteria –* 5% of rows will be selected

num_workers    exec_time (ms)
0              18906
2              13446
4              8970
8              7887
16             10403

*Test-6*
*Query -* Explain analyze select c1 from tbl_perf
where c1 > 27000000 and c2 like '%aa%';
*Selection_criteria –* 10% of rows will be selected

num_workers    exec_time (ms)
0              16132
2              23780
4              20275
8              11390
16             11418

Conclusion
------------------
1. Parallel workers help a lot when there is an expensive qualification
to evaluate; the more expensive the qualification, the better the results.
2. It works well for low-selectivity quals; as the selectivity increases,
the benefit tends to go down due to the additional tuple communication cost
between the workers and the master backend.
3. After a certain point, having more workers doesn't help and instead
has a negative impact; refer to Test-4.

I think, as discussed previously, we need to introduce 2 additional cost
variables (parallel_startup_cost, cpu_tuple_communication_cost) to
estimate the parallel seq scan cost, so that when the table is small
or the selectivity is high, the cost of the parallel plan goes up
accordingly.
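
To make that concrete, here is a rough sketch of how the two variables
could enter the estimate (the names are the ones proposed above; the
formula itself is only an illustration, not what the patch currently
does). Since Cost is just a double in the backend, plain doubles are
used here:

/* illustrative defaults only, not actual GUC values */
static double parallel_startup_cost = 1000.0;
static double cpu_tuple_communication_cost = 0.1;

/*
 * Sketch of a parallel seq scan cost estimate: the scan and qual work is
 * divided among the workers and the master, but every qualifying tuple
 * has to be sent back to the master, and starting workers is not free.
 */
static double
parallel_seqscan_cost_sketch(double ntuples, double qual_selectivity,
                             int nworkers, double seqscan_run_cost)
{
    double  returned_tuples = ntuples * qual_selectivity;
    double  run_cost = seqscan_run_cost / (nworkers + 1);

    run_cost += cpu_tuple_communication_cost * returned_tuples;

    return parallel_startup_cost + run_cost;
}

Such a formula would naturally make the parallel plan costlier for small
tables and for high-selectivity quals, which matches the trend in the
tests above.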

Thoughts and feedback on the current state of the patch are welcome.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
