Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-02-08 03:36:07
Message-ID: CAA4eK1LtJLJg+x2zZonMO5TNYs0DP-_fXMC0xBuwC1RTDMACZg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 8, 2015 at 3:46 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Sat, Feb 7, 2015 at 4:30 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2015-02-06 22:57:43 -0500, Robert Haas wrote:
> >> On Fri, Feb 6, 2015 at 2:13 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> > My first comment here is that I think we should actually teach
> >> > heapam.c about parallelism.
> >>
> >> I coded this up; see attached. I'm also attaching an updated version
> >> of the parallel count code revised to use this API. It's now called
> >> "parallel_count" rather than "parallel_dummy" and I removed some
> >> stupid stuff from it. I'm curious to see what other people think, but
> >> this seems much cleaner to me. With the old approach, the
> >> parallel-count code was duplicating some of the guts of heapam.c and
> >> dropping the rest on the floor; now it just asks for a parallel scan
> >> and away it goes. Similarly, if your parallel-seqscan patch wanted to
> >> scan block-by-block rather than splitting the relation into equal
> >> parts, or if it wanted to participate in the synchronized-seqscan
> >> stuff, there was no clean way to do that. With this approach, those
> >> decisions are - as they quite properly should be - isolated within
> >> heapam.c, rather than creeping into the executor.
> >
> > I'm not convinced that that reasoning is generally valid. While it may
> > work out nicely for seqscans - which might be useful enough on its own -
> > the more stuff we parallelize the *more* the executor will have to know
> > about it to make it sane. To actually scale nicely e.g. a parallel sort
> > will have to execute the nodes below it on each backend, instead of
> > doing that in one as a separate step, ferrying over all tuples to
> > individual backends through queues, and only then parallelizing the
> > sort.
> >
> > Now. None of that is likely to matter immediately, but I think starting
> > to build the infrastructure at the points where we'll later need it does
> > make some sense.

I think doing it for parallel seq scan as well makes the processing in
the worker much easier, for example the processing of prepared queries
(bind parameters), of the Explain statement, of qualification and
projection, and the decision about processing junk entries.

>
> Well, I agree with you, but I'm not really sure what that has to do
> with the issue at hand. I mean, if we were to apply Amit's patch,
> we'd be in a situation where, for a non-parallel heap scan, heapam.c
> decides the order in which blocks get scanned, but for a parallel heap
> scan, nodeParallelSeqscan.c makes that decision.

I think other places also decide the order/way in which heapam.c has
to scan. For example, the order in which rows/pages have to be traversed
is decided at the portal/executor layer and passed down to the heap, and
in the case of an index, the scan limits (heap_setscanlimits()) are decided
outside heapam.c; something similar is done for parallel seq scan.
In general, the scan is driven by the scan descriptor, which is constructed
at the upper level, and there are some APIs exposed to drive the scan.
If you are not happy with the current way nodeParallelSeqscan has
set the scan limits, we can have some form of callback which does the
required work, and this callback can be called from heapam.c.
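
To make the callback idea concrete, here is a rough standalone sketch.
It is a toy model, not PostgreSQL code: ToyScanDesc, ToySharedState,
toy_next_in_range(), toy_next_shared() and toy_scan_blocks() are all
hypothetical names invented for illustration, and the shared counter is
a plain integer rather than anything living in shared memory. The point
is only that the upper layer fills in the descriptor (either fixed scan
limits or a callback that claims the next block), while the loop that
actually walks the blocks stays in one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are real PostgreSQL types. */
typedef uint32_t BlockNo;
#define TOY_INVALID_BLOCK ((BlockNo) 0xFFFFFFFF)

struct ToyScanDesc;
typedef BlockNo (*toy_next_block_fn) (struct ToyScanDesc *scan);

/* Toy scan descriptor, filled in by the upper (executor-like) layer. */
typedef struct ToyScanDesc
{
    BlockNo     start_block;    /* first block of a fixed range */
    BlockNo     nblocks;        /* number of blocks in that range */
    BlockNo     nreturned;      /* blocks handed out to this scan so far */
    toy_next_block_fn next_block;   /* callback choosing the next block */
    void       *arg;            /* private state for the callback */
} ToyScanDesc;

/* State shared by all workers of one (simulated) parallel scan. */
typedef struct ToySharedState
{
    BlockNo     next;           /* next unclaimed block */
    BlockNo     nblocks;        /* total blocks in the relation */
} ToySharedState;

/* Strategy 1: fixed limits, walk [start_block, start_block + nblocks). */
BlockNo
toy_next_in_range(ToyScanDesc *scan)
{
    if (scan->nreturned >= scan->nblocks)
        return TOY_INVALID_BLOCK;
    return scan->start_block + scan->nreturned++;
}

/*
 * Strategy 2: block-by-block, each caller claims the next unclaimed block.
 * A real implementation would need an atomic or locked counter here.
 */
BlockNo
toy_next_shared(ToyScanDesc *scan)
{
    ToySharedState *shared = (ToySharedState *) scan->arg;

    if (shared->next >= shared->nblocks)
        return TOY_INVALID_BLOCK;
    scan->nreturned++;
    return shared->next++;
}

/*
 * The "heapam" side: it never decides the order itself, it just asks the
 * callback for up to max blocks and records them. Returns the count.
 */
size_t
toy_scan_blocks(ToyScanDesc *scan, BlockNo *out, size_t max)
{
    size_t      n = 0;

    assert(scan->next_block != NULL);
    while (n < max)
    {
        BlockNo     blk = scan->next_block(scan);

        if (blk == TOY_INVALID_BLOCK)
            break;
        out[n++] = blk;
    }
    return n;
}
```

With a layout like this, switching from equal-sized ranges to
block-by-block hand-out would only mean installing a different callback
in the descriptor; the scanning loop itself would not change.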

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
