When the optimizer determines that parallel query is the fastest execution strategy for a particular query, it will create a query plan which includes a Gather or Gather Merge node. Here is a simple example:
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%'; QUERY PLAN ------------------------------------------------------------------------------------- Gather (cost=1000.00..217018.43 rows=1 width=97) Workers Planned: 2 -> Parallel Seq Scan on pgbench_accounts (cost=0.00..216018.33 rows=1 width=97) Filter: (filler ~~ '%x%'::text) (4 rows)
In all cases, the Gather
or
Gather Merge
node will have exactly
one child plan, which is the portion of the plan that will be
executed in parallel. If the Gather
or
Gather Merge
node is at the very top
of the plan tree, then the entire query will execute in parallel.
If it is somewhere else in the plan tree, then only the portion of
the plan below it will run in parallel. In the example above, the
query accesses only one table, so there is only one plan node other
than the Gather
node itself; since
that plan node is a child of the Gather
node, it will run in parallel.
Using EXPLAIN, you can see the
number of workers chosen by the planner. When the Gather
node is reached during query execution, the
process which is implementing the user's session will request a
number of background
worker processes equal to the number of workers chosen by the
planner. The number of background workers that the planner will
consider using is limited to at most max_parallel_workers_per_gather.
The total number of background workers that can exist at any one
time is limited by both max_worker_processes
and max_parallel_workers.
Therefore, it is possible for a parallel query to run with fewer
workers than planned, or even with no workers at all. The optimal
plan may depend on the number of workers that are available, so
this can result in poor query performance. If this occurrence is
frequent, consider increasing max_worker_processes
and max_parallel_workers
so that more workers can be
run simultaneously or alternatively reducing max_parallel_workers_per_gather
so that the
planner requests fewer workers.
Every background worker process which is successfully started
for a given parallel query will execute the parallel portion of the
plan. The leader will also execute that portion of the plan, but it
has an additional responsibility: it must also read all of the
tuples generated by the workers. When the parallel portion of the
plan generates only a small number of tuples, the leader will often
behave very much like an additional worker, speeding up query
execution. Conversely, when the parallel portion of the plan
generates a large number of tuples, the leader may be almost
entirely occupied with reading the tuples generated by the workers
and performing any further processing steps which are required by
plan nodes above the level of the Gather
node or Gather
Merge
node. In such cases, the leader will do very little of
the work of executing the parallel portion of the plan.
When the node at the top of the parallel portion of the plan is
Gather Merge
rather than Gather
, it indicates that each process executing
the parallel portion of the plan is producing tuples in sorted
order, and that the leader is performing an order-preserving merge.
In contrast, Gather
reads tuples from
the workers in whatever order is convenient, destroying any sort
order that may have existed.
If you see anything in the documentation that is not correct, does not match your experience with the particular feature or requires further clarification, please use this form to report a documentation issue.