Re: Parallel Seq Scan vs kernel read ahead

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan vs kernel read ahead
Date: 2020-07-22 05:20:52
Message-ID: CAApHDvrTg_6cpeTZxdS=iQcCam4U_y+paBGHYNB5znD+X9U9KA@mail.gmail.com
Lists: pgsql-hackers

On Wed, 22 Jul 2020 at 16:40, k(dot)jamison(at)fujitsu(dot)com
<k(dot)jamison(at)fujitsu(dot)com> wrote:
> I used the default max_parallel_workers & max_worker_processes, which are both 8 by default in postgresql.conf.
> IOW, I ran all those tests with a maximum of 8 processes set. But my query planner capped both the
> Workers Planned and Launched at 6 for some reason when increasing the value for
> max_parallel_workers_per_gather.

max_parallel_workers_per_gather just imposes a limit on the planner as
to the maximum number of parallel workers it may choose for a given
parallel portion of a plan. The number of workers the planner actually
chooses is based on the size of the relation. More pages = more
workers. It sounds like in this case the planner didn't think it was
worth using more than 6 workers.

The parallel_workers reloption, when not set to -1, overrides the
planner's decision on how many workers to use. The planner will then
always try to use "parallel_workers" workers.
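
As a rough illustration of "more pages = more workers", here is a
minimal Python sketch of how the planned worker count could be
derived. It is only a model, not PostgreSQL's code: the function name
is hypothetical, and it assumes the default
min_parallel_table_scan_size of 8MB (1024 pages of 8kB) and that one
more worker is added each time the relation triples in size.

```python
def planned_workers(heap_pages, max_per_gather, reloption=-1,
                    min_scan_pages=1024):
    """Rough model of the worker count the planner might pick for a seq scan.

    Assumptions (not taken from the email): min_scan_pages=1024 stands in
    for an 8MB min_parallel_table_scan_size with 8kB pages, and one extra
    worker is added each time the table size triples.
    """
    if reloption != -1:
        # The parallel_workers reloption replaces the size-based estimate.
        workers = reloption
    else:
        if heap_pages < min_scan_pages:
            return 0  # too small to be worth a parallel scan
        workers = 1
        threshold = max(min_scan_pages, 1)
        while heap_pages >= threshold * 3:
            workers += 1
            threshold *= 3
    # max_parallel_workers_per_gather caps whichever value was chosen.
    return min(workers, max_per_gather)


if __name__ == "__main__":
    print(planned_workers(300_000, max_per_gather=8))
    print(planned_workers(300_000, max_per_gather=32, reloption=16))
```

In this model, a table of 300,000 pages gets 6 workers even with
max_parallel_workers_per_gather set to 8, and a seventh worker is only
considered once the table reaches 746,496 pages.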

> However, when I used ALTER TABLE SET (parallel_workers = N) based on your suggestion,
> the query planner used that value only for "Workers Planned", but not for "Workers Launched".
> The behavior of the query planner is also different when I also set the value of max_worker_processes
> and max_parallel_workers to parallel_workers + 1.

When it comes to execution, the executor is limited by how many
parallel worker processes are available to execute the plan. If all
workers happen to be busy with other tasks then it may find itself
having to process the entire query by itself without any help from
workers. Or there may be workers available, just not as many as the
planner picked to execute the query.

The number of available workers is configured with the
"max_parallel_workers" GUC, which is set in postgresql.conf. PostgreSQL
won't complain if you try to set a relation's parallel_workers
reloption to a number higher than the "max_parallel_workers" GUC.
"max_parallel_workers" is further limited by "max_worker_processes".
Likely you'll want to set both of those to at least 32 for this test,
then just adjust the relation's parallel_workers setting for each
test.
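
To make the difference between "Workers Planned" and "Workers
Launched" concrete, here is a small Python sketch of the limits
described above. It is a simplification under stated assumptions: the
function name and the "busy" parameter are hypothetical, and it
ignores any other background workers that also take slots from
max_worker_processes.

```python
def launched_workers(planned, max_parallel_workers, max_worker_processes,
                     busy=0):
    """Rough model of how many planned workers can actually be launched.

    Assumption: the parallel worker pool is the smaller of the two GUCs,
    and 'busy' (hypothetical) counts workers already used by other queries.
    """
    pool = min(max_parallel_workers, max_worker_processes)
    free = max(pool - busy, 0)
    # The executor gets at most what the planner asked for, and possibly
    # fewer (even zero) if the pool is exhausted.
    return min(planned, free)


if __name__ == "__main__":
    # parallel_workers = 16 on the table, but both GUCs left at 8.
    print(launched_workers(16, 8, 8))
    # Same plan with both GUCs raised to 32.
    print(launched_workers(16, 32, 32))
```

With both GUCs at 8, a plan asking for 16 workers can launch at most
8 in this model; raising both to 32 lets all 16 start, provided no
other session is using them.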

David
