Re: Parallel Query

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Krithika Venkatesh <krithikavenkatesh31(at)gmail(dot)com>
Cc: pgsql <pgsql-general(at)postgresql(dot)org>
Subject: Re: Parallel Query
Date: 2018-01-11 09:05:46
Message-ID: CAOGQiiNVXF-n800pAMKMi=UEn99pXBxHv8TmrMSUVvr70Pzjnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Jan 11, 2018 at 12:24 PM, Krithika Venkatesh <
krithikavenkatesh31(at)gmail(dot)com> wrote:

> Hi All,
>
> I was reading about parallel queries in postgresql 10.
>
> I came across the property max_parallel_workers_per_gather. It sets the
> maximum number of workers that can be started by a single Gather or Gather
> Merge node.
>
> I am not able to understand gather and gather merge node.
>
> With parallel query in picture, there are two new types of nodes
introduced in postgresql 10 viz, Gather node and other is Gather-Merge
node.
Now, if seq-scan has to be performed in parallel, then gather node comes
into picture, which you may think of as a master node for the parallel
workers which are performing sequential scan. e.g.
QUERY PLAN
------------------------------------------------------------------
Gather (cost=0.00..7.60 rows=246 width=8)
Workers Planned: 2
-> Parallel Seq Scan on t (cost=0.00..7.60 rows=102 width=8)
Filter: (i < 50)
(4 rows)

Here, there are two workers which are in parallel scanning the table using
seq-scan and gather is the process which collects the result of this data
and passes it to the node above it, if any.

Now, just like in seq-scan the order of tuples is not maintained so is the
case in gather node. However, for the cases wherein you need to maintain
the order, gather-merge node comes into picture. For example when index
scan is used in parallel then gather-merge will be above it to collect the
results of parallel scans respecting the order of indexes.

What is the difference between max_parallel_workers_per_gather and
> max_parallel_processes. What is meant by gather node.
>
> As you mentioned above, max_parallel_workers_per_gather is the total
number of parallel workers allowed for a node. The parameter
max_parallel_processes sets the upper limit of total parallel workers
running. As a base rule, the value of max_parallel_workers_per_gather
should be less than or equal to max_parallel_processes, otherwise the value
of max_parallel_processes will be the maximum allowable workers for a
parallel node.

>
You may have a look at the documentation of these parameters for more
details --
https://www.postgresql.org/docs/10/static/when-can-parallel-query-be-used.html

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

In response to

Browse pgsql-general by date

  From Date Subject
Next Message pinker 2018-01-11 09:15:07 RE: How Many Partitions are Good Performing
Previous Message Durumdara 2018-01-11 08:57:07 Re: String comparison problem in select - too many results