Re: Multithread Query Planner

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Christopher Browne <cbbrowne(at)gmail(dot)com>, Frederico <zepfred(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multithread Query Planner
Date: 2012-01-24 16:00:15
Message-ID: CA+TgmoYD7ff14DE2ftH75m++2R-p8SnS3E=xece0jobAVheYxw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 23, 2012 at 2:45 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> Yes, but OP is proposing to use multiple threads inside the forked
> execution process.  That's a completely different beast.  Many other
> databases support parallel execution of a single query and it might
> very well be better/easier to do that with threads.

I doubt it. Almost nothing in the backend is thread-safe. You can't
acquire a heavyweight lock, a lightweight lock, or a spinlock. You
can't do anything that might elog() or ereport(). None of those
things are reentrant. Consequently, you can't do anything that
involves reading or pinning a buffer, making a syscache lookup, or
writing WAL. You can't even do something like parallelize the
qsort() of a chunk of data that's already been read into a private
buffer... because you'd have to call the comparison functions for the
data type, and they might elog() or ereport(). Of course, in certain
special cases (like int4) you could make it safe, but it's hard to
imagine anyone wanting to go to that amount of effort for such a small
payoff.
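
As an illustration of the int4-style special case mentioned above, here is a
minimal sketch in Python rather than backend C. It assumes the data is already
in a private list and that the comparison is a plain integer compare, which
touches no shared state and cannot raise an error. The name
parallel_sort_ints and the workers parameter are hypothetical, and Python
threads will not necessarily make this faster; only the structure (sort
chunks independently, then merge) is the point.

```python
import heapq
import threading


def parallel_sort_ints(values, workers=4):
    """Sort a private list of ints: sort chunks in threads, then merge."""
    if not values:
        return []
    size = -(-len(values) // workers)  # ceiling division
    # Slicing copies, so each thread owns its chunk and the input is untouched.
    chunks = [values[i:i + size] for i in range(0, len(values), size)]

    def sort_chunk(chunk):
        # Plain int comparison: no locks, no shared state, nothing that
        # could report an error part-way through.
        chunk.sort()

    threads = [threading.Thread(target=sort_chunk, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # k-way merge of the independently sorted chunks.
    return list(heapq.merge(*chunks))
```

The sketch only stays safe because the comparator is trivial. A comparison
function for an arbitrary data type could report an error from inside a
worker thread, which is exactly the case described above.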

If we're going to do parallel query in PG, and I think we are going to
need to do that eventually, we're going to need a system where large
chunks of work can be handed off, as in the oft-repeated example of
parallelizing an append node by executing multiple branches
concurrently. That's where the big wins are. And that means either
overhauling the entire backend to make it thread-safe, or using
multiple backends. The latter will be hard, but it'll still be a lot
easier than the former.
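
The append-node example can be sketched with separate processes instead of
threads. This is a toy model in Python, not backend code, and it assumes a
POSIX system with fork(): each branch is just a list of rows, scan_branch is
a hypothetical stand-in for executing one child plan, and results come back
to the parent over a pipe as JSON. Because each worker has its own address
space, nothing in the worker has to be thread-safe.

```python
import json
import os


def scan_branch(rows):
    """Hypothetical stand-in for executing one branch of an append node."""
    return list(rows)


def parallel_append(branches):
    """Run each branch in its own forked process; concatenate in order."""
    workers = []
    for rows in branches:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: do the work, send the result, exit immediately.
            os.close(read_fd)
            status = 1
            try:
                payload = json.dumps(scan_branch(rows)).encode()
                with os.fdopen(write_fd, "wb") as out:
                    out.write(payload)
                status = 0
            finally:
                os._exit(status)
        # Parent: keep only the read end so EOF arrives when the child exits.
        os.close(write_fd)
        workers.append((pid, read_fd))

    result = []
    for pid, read_fd in workers:
        with os.fdopen(read_fd, "rb") as src:
            data = src.read()  # reads until the child closes its end
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError("worker %d failed" % pid)
        result.extend(json.loads(data))
    return result
```

For example, parallel_append([[1, 2], [3], [4, 5, 6]]) returns the rows of all
three branches in branch order. All branches run concurrently, and the parent
only collects and concatenates their output, which is the kind of large
hand-off described above.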

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
