Re: Parallel Seq Scan

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-07-27 04:41:56
Message-ID: CAJrrPGdYaKkFLQFpWkdO=4HYhSj7Zt_10Y+M-oXzydAfuw9p5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 23, 2015 at 9:42 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Jul 22, 2015 at 9:14 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> One thing I noticed that is a bit dismaying is that we don't get a lot
>> of benefit from having more workers. Look at the 0.1 data. At 2
>> workers, if we scaled perfectly, we would be 3x faster (since the
>> master can do work too), but we are actually 2.4x faster. Each
>> process is on the average 80% efficient. That's respectable. At 4
>> workers, we would be 5x faster with perfect scaling; here we are 3.5x
>> faster. So the third and fourth worker were about 50% efficient.
>> Hmm, not as good. But then going up to 8 workers bought us basically
>> nothing.
>>
>
> I think the improvement also depends on how costly is the qualification,
> if it is costly, even for same selectivity the gains will be shown till
> higher
> number of clients and for simple qualifications, we will see that cost of
> having more workers will start dominating (processing data over multiple
> tuple queues) over the benefit we can achieve by them.

Yes, That's correct. when the qualification cost is increased, the performance
is also increasing with number of workers.

Instead of using all the configured workers per query, how about deciding number
of workers based on cost of the qualification? I am not sure whether we have
any information available to find out the qualification cost. This way
the workers
will be distributed to all backends properly.

Regards,
Hari Babu
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2015-07-27 05:00:02 spgist recovery assertion failure
Previous Message Craig Ringer 2015-07-27 03:55:35 Re: raw output from copy