RE: Parallel Seq Scan vs kernel read ahead

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: 'David Rowley' <dgrowleyml(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Parallel Seq Scan vs kernel read ahead
Date: 2020-07-22 06:17:33
Message-ID: OSBPR01MB23415E69D44DCEA762760E5FEF790@OSBPR01MB2341.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, July 22, 2020 2:21 PM (GMT+9), David Rowley wrote:

> On Wed, 22 Jul 2020 at 16:40, k(dot)jamison(at)fujitsu(dot)com <k(dot)jamison(at)fujitsu(dot)com>
> wrote:
> > I used the default max_parallel_workers & max_worker_proceses which is 8 by
> default in postgresql.conf.
> > IOW, I ran all those tests with maximum of 8 processes set. But my
> > query planner capped both the Workers Planned and Launched at 6 for
> > some reason when increasing the value for max_parallel_workers_per_gather.
>
> max_parallel_workers_per_gather just imposes a limit on the planner as to the
> maximum number of parallel workers it may choose for a given parallel portion of
> a plan. The actual number of workers the planner will decide is best to use is
> based on the size of the relation. More pages = more workers. It sounds like in
> this case the planner didn't think it was worth using more than 6 workers.
>
> The parallel_workers reloption, when not set to -1 overwrites the planner's
> decision on how many workers to use. It'll just always try to use
> "parallel_workers".
>
> > However, when I used the ALTER TABLE SET (parallel_workers = N) based
> > from your suggestion, the query planner acquired that set value only for
> "Workers Planned", but not for "Workers Launched".
> > The behavior of query planner is also different when I also set the
> > value of max_worker_processes and max_parallel_workers to parallel_workers
> + 1.
>
> When it comes to execution, the executor is limited to how many parallel worker
> processes are available to execute the plan. If all workers happen to be busy with
> other tasks then it may find itself having to process the entire query in itself
> without any help from workers. Or there may be workers available, just not as
> many as the planner picked to execute the query.

Even though I read the documentation [1][2] on parallel query, I might not have
understood it clearly yet. So thank you very much for explaining simpler how the
relation size, GUCs, and reloption affect the query planner's behavior
So in this test case, I shouldn't force the workers to have same values for workers
planned and workers launched, is it correct? To just let the optimizer do its own decision.

> The number of available workers is configured with the
> "max_parallel_workers". That's set in postgresql.conf. PostgreSQL
> won't complain if you try to set a relation's parallel_workers reloption to a number
> higher than the "max_parallel_workers" GUC.
> "max_parallel_workers" is further limited by "max_worker_processes".
> Likely you'll want to set both those to at least 32 for this test, then just adjust the
> relation's parallel_workers setting for each test.
>
Thank you for the advice. For the same test case [3], I will use the following configuration:
shared_buffers = 32MB
max_parallel_workers =32
max_worker_processes = 32

Maybe the relation size is also small as you mentioned, that the query optimizer decided
to only use 6 workers in my previous test. So let me see first if the results would vary
again with the above configuration and testing different values for parallel_workers.

Kind regards,
Kirk Jamison

[1] https://www.postgresql.org/docs/13/how-parallel-query-works.html
[2] https://www.postgresql.org/docs/current/runtime-config-resource.html
[3] https://www.postgresql.org/message-id/CADwEdoqirzK3H8bB=xyJ192EZCNwGfcCa_WJ5GHVM7Sv8oenuA@mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2020-07-22 07:41:48 Re: Transactions involving multiple postgres foreign servers, take 2
Previous Message Amit Langote 2020-07-22 06:16:02 Re: problem with RETURNING and update row movement