Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-10-15 07:32:12
Message-ID: CAA4eK1J3es5xddQqWsARtrhgOFmwzBPYN=OaOzayFg+nLmEQjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 14, 2015 at 3:29 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Oct 13, 2015 at 2:45 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> > Attached is rebased patch for partial seqscan support.
>
> Review comments:
>
> - If you're going to pgindent execParallel.c, you need to add some
> entries to typedefs.list so it doesn't mangle the formatting.
> ExecParallelEstimate's parameter list is misformatted, for example.
> Also, I think if we're going to do this we had better extract the
> pgindent changes and commit those first. It's pretty distracting the
> way you have it.
>

Agreed, we can do those separately. So, I have reverted those changes.

> - Instead of inlining the work needed by each parallel mode in
> ExecParallelEstimate(), I think you should mimic the style of
> ExecProcNode and call a node-type specific function that is part of
> that node's public interface - here, ExecPartialSeqScanEstimate,
> perhaps. Similarly for ExecParallelInitializeDSM. Perhaps
> ExecPartialSeqScanInitializeDSM.
>
> - I continue to think GetParallelShmToc is the wrong approach.
> Instead, each time ExecParallelInitializeDSM or
> ExecParallelInitializeDSM calls a nodetype-specific initialized
> function (as described in the previous point), have it pass d->pcxt as
> an argument. The node can get the toc from there if it needs it. I
> suppose it could store a pointer to the toc in its scanstate if it
> needs it, but it really shouldn't. Instead, it should store a pointer
> to, say, the ParallelHeapScanDesc in the scanstate. It really should
> only care about its own shared data, so once it finds that, the toc
> shouldn't be needed any more. Then ExecPartialSeqScan doesn't need to
> look up pscan; it's already recorded in the scanstate.
>
> - ExecParallelInitializeDSMContext's new pscan_len member is 100%
> wrong. Individual scan nodes don't get to add stuff to that context
> object. They should store details like this in their own ScanState as
> needed.
>

Changed the patch as per above suggestions.

> - The positioning of the new nodes in various lists doesn't seem to
> entirely consistent. nodes.h adds them after SampleScan which isn't
> terrible, though maybe immediately after SeqScan would be better, but
> _outNode has it right after BitmapOr and the switch in _copyObject has
> it somewhere else again.
>

I think this got messed up while rebasing on top of Gather node
changes, but nonetheless, I have changed it such that PartialSeqScan
node handling is after SeqScan.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
parallel_seqscan_partialseqscan_v21.patch application/octet-stream 49.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2015-10-15 08:02:23 PATCH: 9.5 replication origins fix for logical decoding
Previous Message Amit Kapila 2015-10-15 07:08:32 Re: Union All View execution