Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-03-25 15:49:35
Message-ID: CAA4eK1JM0rdzfrTdghm4M+nYZemuRgSLT+ERLaACDU3LBU360A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 25, 2015 at 5:16 PM, Thom Brown <thom(at)linux(dot)com> wrote:
>
> On 25 March 2015 at 10:27, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> Fixed the issue reported on the assess-parallel-safety thread and another
>> bug caught while testing joins, and integrated the patch with the latest
>> version of the parallel-mode patch (parallel-mode-v9 patch).
>>
>> Apart from that, I have moved the initialization of the dsm segment from
>> the InitNode phase to ExecFunnel() (on first execution), as per Robert's
>> suggestion. The main idea is that, since it creates a large shared memory
>> segment, the work should be done only when it is really required.
>>
>>
>> HEAD Commit-Id: 11226e38
>> parallel-mode-v9.patch [2]
>> assess-parallel-safety-v4.patch [1]
>> parallel-heap-scan.patch [3]
>> parallel_seqscan_v12.patch (Attached with this mail)
>>
>> [1] - http://www.postgresql.org/message-id/CA+TgmobJSuefiPOk6+i9WERUgeAB3ggJv7JxLX+r6S5SYydBRQ@mail.gmail.com
>> [2] - http://www.postgresql.org/message-id/CA+TgmoZfSXZhS6qy4Z0786D7iU_AbhBVPQFwLthpSvGieczqHg@mail.gmail.com
>> [3] - http://www.postgresql.org/message-id/CA+TgmoYJETgeAXUsZROnA7BdtWzPtqExPJNTV1GKcaVMgSdhug@mail.gmail.com
>
>
> Okay, with my pgbench_accounts partitioned into 300, I ran:
>
> SELECT DISTINCT bid FROM pgbench_accounts;
>
> The query never returns,

You seem to be hitting the issue I pointed out in a nearby thread [1];
I mentioned the same thing while replying on the assess-parallel-safety
thread. Can you check again after applying the patch in mail [1]?

> and I also get this:
>
> grep -r 'starting background worker process "parallel worker for PID 12165"' postgresql-2015-03-25_112522.log | wc -l
> 2496
>
> 2,496 workers? This is with parallel_seqscan_degree set to 8. If I set
> it to 2, this number goes down to 626, and with 16, goes up to 4320.
>
..
>
> Still not sure why 8 workers are needed for each partial scan. I would
> expect 8 workers to be used for 8 separate scans. Perhaps this is just my
> misunderstanding of how this feature works.
>

The reason is that each table scan tries to use as many workers as
parallel_seqscan_degree specifies, if that many are available. In this
case, the scans of the tables in the inheritance hierarchy happen one
after another, so each scan uses 8 workers. As of now, the strategy for
deciding the number of workers used in a scan is kept simple; in future
we can try to come up with a better mechanism for deciding the number
of workers.
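
To make the arithmetic concrete, here is a rough sketch of that behaviour.
It is not code from the patch: the function name workers_launched and the
"available" parameter are hypothetical, and it assumes every scan in the
hierarchy releases its workers before the next scan starts.

```python
def workers_launched(num_scans, degree, available=None):
    """Count worker launches when table scans run one after another.

    Hypothetical model, not code from the patch: each scan asks for
    `degree` workers and gets at most `available` of them (None means
    the request is always fully satisfied). Workers are released when
    a scan ends, so the next scan launches a fresh set.
    """
    if num_scans < 0 or degree < 0:
        raise ValueError("num_scans and degree must be non-negative")
    per_scan = degree if available is None else min(degree, max(available, 0))
    return num_scans * per_scan


if __name__ == "__main__":
    # 300 child tables, scanned sequentially, at three degree settings.
    for degree in (2, 8, 16):
        print(degree, workers_launched(300, degree))
```

For 300 scans this model gives 600, 2400 and 4800 launches at degrees 2, 8
and 16. That is close to, but not the same as, the 626, 2,496 and 4,320
reported above; the sketch is only meant to show why the count scales with
the number of scans rather than staying at the configured degree.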

[1] - http://www.postgresql.org/message-id/CAA4eK1+NwUJ9ik61yGfZBcN85dQuNEvd38_h1zngCdZrGLGQTQ@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
