Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, John Gorman <johngorman2(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-01-29 12:21:54
Message-ID: CAA4eK1JHCmN2X1LjQ4bOmLApt+btOuid5Vqqk5G6dDFV69iyHg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 28, 2015 at 8:59 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> I have tried the tests again and found that I have forgotten to increase
> max_worker_processes due to which the data is so different. Basically
> at higher client count it is just scanning lesser number of blocks in
> fixed chunk approach. So today I again tried with changing
> max_worker_processes and found that there is not much difference in
> performance at higher client count. I will take some more data for
> both block_by_block and fixed_chunk approach and repost the data.
>

I have again taken the data and found that there is not much difference
between the block-by-block and fixed-chunk approaches; the data is at the
end of the mail for your reference. There is variation in some cases, e.g.
in the fixed-chunk approach the 8-worker case shows a lower time, however
on certain executions it has taken almost the same time as the other
worker counts.

Now if we go with the block-by-block approach, we have the advantage that
the work-distribution granularity will be smaller and hence better, and if
we go with the chunk-by-chunk (fixed chunk of 1GB) approach, there is a
good chance that the kernel can do better optimization (e.g. read-ahead)
when reading each chunk sequentially.
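To make the contrast concrete, here is a minimal sketch (not the patch
code; function and variable names are hypothetical) of how each approach
hands out blocks, assuming 8kB blocks so that a 1GB chunk is 131072 blocks:

```c
/* Block-by-block: each worker grabs the next unscanned block.
 * In the real patch this counter would live in shared memory and be
 * advanced under a lock or with an atomic; a plain static suffices
 * for illustration. */
static int next_block = 0;

static int
get_next_block_bbb(int nblocks)
{
    if (next_block >= nblocks)
        return -1;              /* no blocks left */
    return next_block++;
}

/* Fixed-chunk: worker i scans the contiguous block range
 * [i * blocks_per_chunk, (i + 1) * blocks_per_chunk), which lets the
 * kernel see a long sequential read per worker. */
static int
chunk_start_block(int worker_id, int blocks_per_chunk)
{
    return worker_id * blocks_per_chunk;
}
```

The fine granularity of the first variant balances work automatically,
while the second trades that for sequential I/O within each chunk.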

Based on inputs on this thread, one possible execution strategy could
be:

a. In the optimizer, based on effective_io_concurrency, the size of the
relation and parallel_seqscan_degree, we can decide how many workers can
be used for executing the plan:
- choose number_of_workers equal to effective_io_concurrency if it is
less than parallel_seqscan_degree, else number_of_workers will be
equal to parallel_seqscan_degree.
- if the size of the relation is greater than number_of_workers GB
(if number_of_workers is 8, then we need to compare the size of the
relation with 8GB), then keep number_of_workers intact and distribute
the remaining chunks/segments during execution, else
reduce number_of_workers such that each worker gets 1GB
to operate on.
- if the size of the relation is less than 1GB, then we can either not
choose parallel_seqscan at all, or use smaller chunks,
or use the block-by-block approach to execute.
- here we need to consider other parameters like parallel_setup cost,
parallel_startup cost and tuple_communication cost as well.
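The worker-count heuristic above could be sketched roughly as follows
(hypothetical names, sizes taken in whole GB; this ignores the
parallel_setup/startup and tuple_communication costs mentioned in the
last point):

```c
#define CHUNK_SIZE_GB 1     /* assumed fixed chunk size from the thread */

/* Decide how many workers to plan with, per the steps in (a). */
static int
choose_parallel_workers(int effective_io_concurrency,
                        int parallel_seqscan_degree,
                        int relation_size_gb)
{
    int nworkers;

    /* relation smaller than one chunk: skip parallel_seqscan entirely
     * (or fall back to smaller chunks / block-by-block) */
    if (relation_size_gb < CHUNK_SIZE_GB)
        return 0;

    /* take the smaller of the two GUC limits */
    nworkers = (effective_io_concurrency < parallel_seqscan_degree)
        ? effective_io_concurrency
        : parallel_seqscan_degree;

    /* if the relation is smaller than nworkers * 1GB, shrink the
     * worker count so each worker still gets a full 1GB chunk;
     * otherwise keep nworkers and hand out leftover chunks at
     * execution time */
    if (relation_size_gb < nworkers * CHUNK_SIZE_GB)
        nworkers = relation_size_gb / CHUNK_SIZE_GB;

    return nworkers;
}
```

For example, an 8-way effective_io_concurrency with
parallel_seqscan_degree = 16 on a 4GB relation would come out at 4
workers, one chunk each.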

b. In the executor, if fewer workers are available than are required
for statement execution, then we can redistribute the remaining
work among the workers that did start.
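One simple way to redistribute, sketched here with hypothetical names:
spread the planned chunks evenly over the workers that actually started,
giving the remainder to the lowest-numbered workers:

```c
/* How many chunks worker 'worker_id' (0-based) should scan when only
 * 'navail' of the planned workers launched. */
static int
chunks_for_worker(int total_chunks, int navail, int worker_id)
{
    int base  = total_chunks / navail;
    int extra = (worker_id < total_chunks % navail) ? 1 : 0;

    return base + extra;
}
```

With 10 chunks and only 4 workers launched, workers 0 and 1 get 3 chunks
each and workers 2 and 3 get 2 each.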

Performance Data - Before the first run for each worker count, I have
executed drop_caches to clear the OS cache and restarted the server, so we
can assume that except Run-1, all other runs have some caching effect.

Fixed-Chunks

No. of workers/Time (ms)       0        8       16       32
Run-1                     322822   245759   330097   330002
Run-2                     275685   275428   301625   286251
Run-3                     252129   244167   303494   278604
Run-4                     252528   259273   250438   258636
Run-5                     250612   242072   235384   265918

Block-By-Block

No. of workers/Time (ms)       0        8       16       32
Run-1                     323084   341950   338999   334100
Run-2                     310968   349366   344272   322643
Run-3                     250312   336227   346276   322274
Run-4                     262744   314489   351652   325135
Run-5                     265987   316260   342924   319200

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
