Re: Parallel Seq Scan

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-11-05 05:52:55
Message-ID: CAJrrPGcncrq9V_B+_1a20XdN2YojhcqiQf=5gV1JKPGq5euG9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 3, 2015 at 9:41 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Oct 23, 2015 at 4:41 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
>>
>> On Fri, Oct 23, 2015 at 10:33 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
>> wrote:
>
> Please find the rebased partial seq scan patch attached with this
> mail.
>
> Robert suggested me off list that we should once try to see if we
> can use Seq Scan node instead of introducing a new Partial Seq Scan
> node. I have analyzed to see if we can use the SeqScan node (containing
> parallel flag) instead of introducing new partial seq scan and found that
> we primarily need to change most of the functions in nodeSeqScan.c to
> have a parallel flag check and do something special for Partial Seq Scan
> and apart from that we need special handling in function
> ExecSupportsBackwardScan(). In general, I think we can make
> SeqScan node parallel-aware by having some special paths without
> introducing much complexity and that can save us code-duplication
> between nodeSeqScan.c and nodePartialSeqScan.c. One thing that makes
> me slightly uncomfortable with this approach is that for partial seq scan,
> currently the plan looks like:
>
> QUERY PLAN
> --------------------------------------------------------------------------
> Gather (cost=0.00..2588194.25 rows=9990667 width=4)
> Number of Workers: 1
> -> Partial Seq Scan on t1 (cost=0.00..89527.51 rows=9990667 width=4)
> Filter: (c1 > 10000)
> (4 rows)
>
> Now instead of displaying Partial Seq Scan, if we just display Seq Scan,
> then it might confuse user, so it is better to add some thing indicating
> parallel node if we want to go this route.

IMO, the change from Partial Seq Scan to Seq Scan may not confuse user,
if we clearly specify in the documentation that all plans under a Gather node
are parallel plans.

This is possible for the execution nodes that executes fully under a
Gather node.
The same is not possible for parallel aggregates, so we have to mention the
aggregate node below Gather node as partial only.

I feel this suggestion arises as may be because of some duplicate code between
Partial Seq Scan and Seq scan. By using Seq Scan node only if we display as
Partial Seq Scan by storing some flag in the plan? This avoids the
need of adding
new plan nodes.

Regards,
Hari Babu
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-11-05 06:00:51 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Previous Message Haribabu Kommi 2015-11-05 05:34:57 Re: NOTIFY in Background Worker