parallelizing subplan execution (was: explain and PARAM_EXEC)

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: parallelizing subplan execution (was: explain and PARAM_EXEC)
Date: 2010-02-20 13:31:01
Message-ID: m2tytc3uga.fsf_-_@hi-media.com
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sat, Feb 20, 2010 at 6:57 AM, Dimitri Fontaine
> <dfontaine(at)hi-media(dot)com> wrote:
>> How much is this stuff dependent on the current state of the
>> backend?
>
> A whole lot.

Bad news.

>> Ok that's a far stretch from the question at hand, but would that be a
>> plausible approach to have parallel queries in PostgreSQL?
>
> This is really a topic for another thread, but at 100,000 feet it
> seems to me that the hardest question is - how will you decide which
> operations to parallelize in the first place? Actually making it
> happen is really hard, too, of course, but even to get to that point
> you have to have some model for what types of operations it makes
> sense to parallelize and how you're going to decide when it's a win.

My naive thought would be to add some cost parameters: first the cost
of fork()ing another backend, then, for each supported subplan type (we
will want to add more, or maybe have a special rendez-vous-materialise
node), some estimate of the data exchange cost.

The planner would then, as usual, try to find the least costly plan,
and would be able to compare plans with and without distributing the
work.
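
To make the comparison concrete, here is a rough sketch in Python. It
is not planner code: the names (fork_cost, tuple_transfer_cost,
Subplan) and the numbers are made up for illustration, and it assumes
the forks happen one after another in the parent while the subplans
themselves run fully concurrently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subplan:
    run_cost: float   # estimated cost of executing the subplan
    rows: int         # estimated number of rows it returns


@dataclass
class ParallelCostParams:
    # Both parameters are hypothetical knobs, not existing settings.
    fork_cost: float = 1000.0          # cost of starting another backend
    tuple_transfer_cost: float = 0.1   # cost of shipping one row back


def serial_cost(subplans: List[Subplan]) -> float:
    """Cost when a single backend runs every subplan in turn."""
    return sum(sp.run_cost for sp in subplans)


def parallel_cost(subplans: List[Subplan], p: ParallelCostParams) -> float:
    """Cost when each subplan runs in its own forked backend.

    Assumption: forks are paid sequentially by the parent, then the
    slowest subplan (execution plus data exchange) sets the total.
    """
    if not subplans:
        return 0.0
    startup = p.fork_cost * len(subplans)
    slowest = max(sp.run_cost + sp.rows * p.tuple_transfer_cost
                  for sp in subplans)
    return startup + slowest


def choose_plan(subplans: List[Subplan], p: ParallelCostParams) -> str:
    """Pick the cheaper alternative, as the planner does for any path."""
    if parallel_cost(subplans, p) < serial_cost(subplans):
        return "parallel"
    return "serial"


if __name__ == "__main__":
    params = ParallelCostParams()
    big = [Subplan(5000.0, 100), Subplan(5000.0, 100)]
    small = [Subplan(100.0, 10), Subplan(100.0, 10)]
    print(choose_plan(big, params), choose_plan(small, params))
```

With those made-up numbers, two expensive subplans are worth
distributing, while two cheap ones are not, because the fork cost
dominates.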

Overly naive?

Regards,
--
dim
