Re: Multi CPU Queries - Feedback and/or suggestions wanted!

From:	"Chuck McDevitt" <cmcdevitt(at)greenplum(dot)com>
To:	"Jeffrey Baker" <jwbaker(at)gmail(dot)com>, "Julius Stroffek" <Julius(dot)Stroffek(at)sun(dot)com>
Cc:	<pgsql-hackers(at)postgresql(dot)org>, "Dano Vojtek" <danielkov(at)gmail(dot)com>
Subject:	Re: Multi CPU Queries - Feedback and/or suggestions wanted!
Date:	2008-10-21 06:28:57
Message-ID:	EB48EBF3B239E948AC1E3F3780CF8F88044E92C8@MI8NYCMAIL02.Mi8.com
Lists:	pgsql-hackers

There is a problem with trying to make Postgres do these things in parallel.

The backend code isn't thread-safe, so a multithreaded
implementation would require quite a bit of work.

Using multiple processes has its own problems. The way locking works
equates one process with one transaction (the proc table has one
entry per process). Processes working on the same query would conflict
on locks and deadlock against each other, and there are many other
problems besides.
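
To illustrate the locking point, here is a minimal toy sketch. It is not PostgreSQL's lock manager; the class `ToyLockTable`, its methods, and the process IDs are all hypothetical. It only assumes that a lock owner is identified by process ID alone, so two workers serving the same query look like unrelated owners and can deadlock each other.

```python
# Toy model only: NOT PostgreSQL's lock manager. All names are hypothetical.
class ToyLockTable:
    """Exclusive locks whose owner is identified by process ID alone."""

    def __init__(self):
        self.owner = {}       # resource name -> pid holding the lock
        self.waiting_on = {}  # pid -> pid it is blocked behind

    def acquire(self, pid, resource):
        holder = self.owner.get(resource)
        if holder is None or holder == pid:
            self.owner[resource] = pid
            return True
        # A different pid holds the lock, so the requester must wait,
        # even if both pids are doing work for the same query.
        self.waiting_on[pid] = holder
        return False

    def deadlocked(self, pid):
        """Follow the wait-for chain from pid; a cycle means deadlock."""
        seen = set()
        while pid in self.waiting_on:
            if pid in seen:
                return True
            seen.add(pid)
            pid = self.waiting_on[pid]
        return False


locks = ToyLockTable()
# Two worker processes (pids 101 and 102) cooperate on ONE query.
got_a = locks.acquire(101, "table_a")          # worker 101 locks table_a
got_b = locks.acquire(102, "table_b")          # worker 102 locks table_b
blocked_1 = not locks.acquire(101, "table_b")  # 101 now waits behind 102
blocked_2 = not locks.acquire(102, "table_a")  # 102 now waits behind 101
self_deadlock = locks.deadlocked(101)          # the query deadlocks itself
```

In this toy model, avoiding the self-deadlock would need some notion of several processes sharing one lock owner, which is the kind of change the paragraph above suggests is not small.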

It's all a good idea, but the work is probably far more than you expect.

Async I/O might be easier if you used pthreads, which are mostly
portable, though not to all platforms. (Yes, they do work on Windows.)
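
As a rough sketch of thread-based read-ahead (written in Python rather than C with pthreads, purely for brevity), a helper thread can fetch the next block while the main thread processes the current one. The block size, the file layout, and the names `read_block` and `scan_with_readahead` are assumptions made for this illustration, not anything in the Postgres source.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 8192  # assumed block size for this sketch


def read_block(path, block_no):
    # Blocking read of one block; runs on the helper I/O thread.
    with open(path, "rb") as f:
        f.seek(block_no * BLOCK)
        return f.read(BLOCK)


def scan_with_readahead(path, n_blocks, process):
    """Process blocks in order while the next block is read in the background."""
    results = []
    if n_blocks == 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(read_block, path, 0)
        for block_no in range(n_blocks):
            data = pending.result()  # wait for the read issued earlier
            if block_no + 1 < n_blocks:
                # Start the next read before processing the current block.
                pending = io.submit(read_block, path, block_no + 1)
            results.append(process(data))  # overlaps with the read in flight
    return results


# Demo: a three-block temporary file, each block filled with its own number.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(3):
        tmp.write(bytes([i]) * BLOCK)
    demo_path = tmp.name

first_bytes = scan_with_readahead(demo_path, 3, lambda block: block[0])
os.remove(demo_path)
```

Only the I/O runs on the extra thread here, so the processing code itself does not have to be thread-safe.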

From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Jeffrey Baker
Sent: 2008-10-20 22:25
To: Julius Stroffek
Cc: pgsql-hackers(at)postgresql(dot)org; Dano Vojtek
Subject: Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek
<Julius(dot)Stroffek(at)sun(dot)com> wrote:

Topics that seem to be of interest, most of which were already
discussed at the developers' meeting in Ottawa, are:
1.) parallel sorts
2.) parallel query execution
3.) asynchronous I/O
4.) parallel COPY
5.) parallel pg_dump
6.) using threads for parallel processing

[...]

2.)
Different subtrees (or nodes) of the plan could be executed in parallel
on different CPUs, and the results of these subtrees could be requested
either synchronously or asynchronously.

I don't see why multiple CPUs can't work on the same node of a plan.
For instance, consider a node involving a scan with an expensive
condition, like UTF-8 string length. If you have four CPUs you can
bring to bear, each CPU could take every fourth page, computing the
expensive condition for each tuple in that page. The results of the
scan can be retired asynchronously to the next node above.
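
A minimal sketch of that page-striding idea, under stated assumptions: pages are modeled as in-memory lists of UTF-8 byte strings, and the names `expensive_condition`, `scan_stride`, and `parallel_scan` are hypothetical. Threads are used only to show the partitioning. In CPython they would not give real CPU parallelism for this predicate, so a process pool would be needed for that. The sketch also gathers results back into page order instead of handing them to a parent node asynchronously.

```python
from concurrent.futures import ThreadPoolExecutor


def expensive_condition(value, min_chars=5):
    # Stand-in for the costly predicate: decode UTF-8 and count characters.
    return len(value.decode("utf-8")) >= min_chars


def scan_stride(pages, worker, n_workers):
    # Worker k visits pages k, k + n, k + 2n, ... and filters their tuples.
    return [
        (page_no, [t for t in pages[page_no] if expensive_condition(t)])
        for page_no in range(worker, len(pages), n_workers)
    ]


def parallel_scan(pages, n_workers=4):
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(scan_stride, pages, w, n_workers)
            for w in range(n_workers)
        ]
        per_page = [item for f in futures for item in f.result()]
    # Restore page order so the output matches a sequential scan.
    per_page.sort(key=lambda item: item[0])
    return [t for _, tuples in per_page for t in tuples]


# Ten pages of four tuples each; character length differs from byte length.
pages = [
    [("é" * ((p + i) % 8)).encode("utf-8") for i in range(4)]
    for p in range(10)
]
matches = parallel_scan(pages, n_workers=4)
sequential = [t for page in pages for t in page if expensive_condition(t)]
```

With four workers, worker 0 handles pages 0, 4, and 8, worker 1 handles pages 1, 5, and 9, and so on, so no page is evaluated twice.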

-jwb

In response to

Re: Multi CPU Queries - Feedback and/or suggestions wanted! at 2008-10-21 05:25:16 from Jeffrey Baker

Responses

Re: Multi CPU Queries - Feedback and/or suggestions wanted! at 2008-10-21 22:50:33 from Myron Scott
