Re: Parallel Seq Scan

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-02-18 13:14:00
Message-ID: 20150218131400.GB16383@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-02-18 16:59:26 +0530, Amit Kapila wrote:
> On Tue, Feb 17, 2015 at 9:52 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
> wrote:
> > A query whose runtime is dominated by a sequential scan (+ attached
> > filter) is certainly going to require a bigger prefetch size than one
> > that does other expensive stuff.
> >
> > Imagine parallelizing
> > SELECT * FROM largetable WHERE col = low_cardinality_value;
> > and
> > SELECT *
> > FROM largetable JOIN gigantic_table ON (index_nestloop_condition)
> > WHERE col = high_cardinality_value;
> >
> > The first query will be a simple sequential scan, and disk reads on
> > largetable will be the major cost of executing it. In contrast the second query
> > might very well sensibly be planned as a parallel sequential scan with
> > the nested loop executing in the same worker. But the cost of the
> > sequential scan itself will likely be completely drowned out by the
> > nestloop execution - index probes are expensive/unpredictable.

> I think the work/task given to each worker should be as granular
> as possible to make it more predictable.
> I think the better way to parallelize such work (a join query) is for
> the first worker to do the sequential scan and filtering on the large
> table and then pass the result to the next worker to join it with
> gigantic_table.

I'm pretty sure that'll result in rather horrible performance. IPC is
rather expensive; you want to do as little of it as possible.
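
To sketch what I mean (not real planner output, and the index name is
invented): instead of one worker scanning/filtering largetable and
shipping every matching row to another worker to do the join, each
worker would run its own copy of a subtree like

  Nested Loop
    -> Parallel Seq Scan on largetable
         Filter: col = high_cardinality_value
    -> Index Scan using gigantic_table_idx on gigantic_table

so only the join output has to cross a process boundary.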

> > >
> > > I think it makes sense to think of a set of tasks in which workers can
> > > assist. So you have a query tree, which is just one query tree, with no
> > > copies of the nodes, and then there are certain places in that query
> > > tree where a worker can jump in and assist that node. To do that, it
> > > will have a copy of the node, but that doesn't mean that all of the
> > > stuff inside the node becomes shared data at the code level, because
> > > that would be stupid.
> >
> > My only "problem" with that description is that I think workers will
> > have to work on more than one node - it'll be entire subtrees of the
> > executor tree.

> There could be some cases where it could be beneficial for a worker
> to process a sub-tree, but I think there will be more cases where
> it will just work on part of a node and send the result back to either
> the master backend or another worker for further processing.

I think many parallelism projects start out that way, and then notice
that it doesn't parallelize very efficiently.

The most extreme, but common, example is aggregation over large amounts
of data - unless you want to ship huge amounts of data between processes
to parallelize it, you have to do the sequential scan and the
pre-aggregate step (which e.g. computes count() and sum() to implement
an avg() across all the workers) inside one worker.
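
As a rough sketch (column names invented for illustration), each worker
would run something like

  SELECT count(col), sum(col) FROM largetable WHERE ...

over just its chunk of the table, and the leader computes the final avg
as sum(per-worker sums) / sum(per-worker counts). That way only one
(count, sum) pair per worker crosses the process boundary instead of
every input row.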

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
