Re: Multi CPU Queries - Feedback and/or suggestions wanted!

From: Myron Scott <lister(at)sacadia(dot)com>
To: Chuck McDevitt <cmcdevitt(at)greenplum(dot)com>
Cc: Jeffrey Baker <jwbaker(at)gmail(dot)com>, Julius Stroffek <Julius(dot)Stroffek(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org, Dano Vojtek <danielkov(at)gmail(dot)com>
Subject: Re: Multi CPU Queries - Feedback and/or suggestions wanted!
Date: 2008-10-21 22:50:33
Message-ID: D93A7122-F689-4FA9-9576-34EC7DFDE871@sacadia.com
Lists: pgsql-hackers

I can confirm that moving the Postgres code to a multi-threaded
implementation requires quite a bit of groundwork. I have been working
for a long while with a Postgres 7.* fork that uses pthreads rather
than processes. The effort to make all the subsystems thread-safe took
some time and touched almost every section of the codebase.

I recently spent some time trying to optimize for Chip Multi-Threading
systems, but focused on total throughput rather than single-query
performance. The biggest wins came from changing some coarse-grained
locks in the page buffering system to a finer-grained implementation.
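
As a rough illustration of what moving from a coarse lock to finer-grained
locks can look like (a sketch only, not the code from the fork; the table
layout, the partition count and every name below are hypothetical), a single
mutex over a buffer lookup table can be replaced by one mutex per partition,
with the partition chosen by hashing the page number:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_PARTITIONS 16   /* hypothetical value; must be a power of two */
#define SLOTS_PER_PART 64   /* hypothetical capacity of each partition */

/* One partition of a toy buffer table: its own lock and its own slots. */
typedef struct {
    pthread_mutex_t lock;
    uint32_t        page[SLOTS_PER_PART];
    int             pin_count[SLOTS_PER_PART];
    int             used;
} BufPartition;

static BufPartition partitions[NUM_PARTITIONS];

static void buf_table_init(void)
{
    /* The index mask in partition_for() relies on a power-of-two count. */
    assert((NUM_PARTITIONS & (NUM_PARTITIONS - 1)) == 0);
    for (int i = 0; i < NUM_PARTITIONS; i++) {
        pthread_mutex_init(&partitions[i].lock, NULL);
        partitions[i].used = 0;
    }
}

/* Map a page number to its partition with a simple multiplicative hash. */
static BufPartition *partition_for(uint32_t page_no)
{
    uint32_t h = page_no * 2654435761u;
    return &partitions[(h >> 16) & (NUM_PARTITIONS - 1)];
}

/*
 * Pin a page and return its pin count afterwards, or -1 if the partition
 * is full. Only the owning partition is locked, so threads working on
 * pages in other partitions are not blocked.
 */
static int buf_pin(uint32_t page_no)
{
    BufPartition *p = partition_for(page_no);
    int result = -1;
    int found = 0;

    pthread_mutex_lock(&p->lock);
    for (int i = 0; i < p->used && !found; i++) {
        if (p->page[i] == page_no) {
            result = ++p->pin_count[i];
            found = 1;
        }
    }
    if (!found && p->used < SLOTS_PER_PART) {
        p->page[p->used] = page_no;
        p->pin_count[p->used] = 1;
        p->used++;
        result = 1;
    }
    pthread_mutex_unlock(&p->lock);
    return result;
}
```

With a single global mutex every pin serializes; here two threads contend
only when their pages hash to the same partition.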

I also tried to improve single-query performance by splitting index and
sequential scans into two threads: one to fault in pages and check tuple
visibility, and the other for everything else. My success was limited,
and it was hard for me to work the proper costing into the query
optimizer so that it fired at the right times.
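
The two-thread split described above is essentially a producer/consumer
pipeline. A minimal sketch, assuming a bounded queue between the threads
(the queue, the integer "tuples" and the stand-in visibility rule are all
hypothetical and not taken from the fork):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_CAP  8     /* hypothetical queue depth */
#define NUM_TUPLES 100   /* size of the toy "table" */

/* Bounded queue carrying tuple ids from the fetch thread to the consumer. */
typedef struct {
    int  items[QUEUE_CAP];
    int  head, tail, count;
    bool done;                       /* set once the producer has finished */
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
} TupleQueue;

static void queue_init(TupleQueue *q)
{
    q->head = q->tail = q->count = 0;
    q->done = false;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

static void queue_put(TupleQueue *q, int v)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    assert(q->count < QUEUE_CAP);
    q->items[q->tail] = v;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

static void queue_close(TupleQueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->done = true;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Returns false once the queue is closed and fully drained. */
static bool queue_get(TupleQueue *q, int *out)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->done)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count == 0) {
        pthread_mutex_unlock(&q->lock);
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return true;
}

/* Stand-in for the visibility check: even ids count as visible. */
static bool tuple_is_visible(int id) { return id % 2 == 0; }

/* Thread 1: "fault in" each tuple and pass on only the visible ones. */
static void *fetch_thread(void *arg)
{
    TupleQueue *q = arg;
    for (int id = 0; id < NUM_TUPLES; id++)
        if (tuple_is_visible(id))
            queue_put(q, id);
    queue_close(q);
    return NULL;
}

/* Thread 2 (the caller): everything else; here it just sums the ids. */
static long run_scan(void)
{
    TupleQueue q;
    pthread_t fetcher;
    long sum = 0;
    int id;

    queue_init(&q);
    if (pthread_create(&fetcher, NULL, fetch_thread, &q) != 0)
        return -1;
    while (queue_get(&q, &id))
        sum += id;
    pthread_join(fetcher, NULL);
    return sum;
}
```

The queue depth bounds how far the fetch thread can run ahead, which is one
of the knobs a cost model for this kind of split would have to account for.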

One place that multiple threads really helped was in index building.

My code is poorly commented and the build system is a mess (I am only
building 64-bit SPARC for embedding into another app). However, I am
using it in production, and the source is available if it's of any help.

http://weaver2.dev.java.net

Myron Scott

On Oct 20, 2008, at 11:28 PM, Chuck McDevitt wrote:

> There is a problem trying to make Postgres do these things in
> parallel.
>
> The backend code isn’t thread-safe, so doing a multi-thread
> implementation requires quite a bit of work.
>
> Using multiple processes has its own problems: The whole way
> locking works equates one process with one transaction (The proc
> table is one entry per process). Processes would conflict on locks,
> deadlocking themselves, as well as many other problems.
>
> It’s all a good idea, but the work is probably far more than you
> expect.
>
> Async I/O might be easier if you used pthreads, which are mostly
> portable, but not to all platforms. (Yes, they do work on Windows.)
>
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Jeffrey Baker
> Sent: 2008-10-20 22:25
> To: Julius Stroffek
> Cc: pgsql-hackers(at)postgresql(dot)org; Dano Vojtek
> Subject: Re: [HACKERS] Multi CPU Queries - Feedback and/or
> suggestions wanted!
>
> On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek <Julius(dot)Stroffek(at)sun(dot)com> wrote:
> Topics that seem to be of interest, most of which were already
> discussed at the developers' meeting in Ottawa, are:
> 1.) parallel sorts
> 2.) parallel query execution
> 3.) asynchronous I/O
> 4.) parallel COPY
> 5.) parallel pg_dump
> 6.) using threads for parallel processing
> [...]
> 2.)
> Different subtrees (or nodes) of the plan could be executed in
> parallel on different CPUs, and the results of these subtrees could
> be requested either synchronously or asynchronously.
>
> I don't see why multiple CPUs can't work on the same node of a
> plan. For instance, consider a node involving a scan with an
> expensive condition, like UTF-8 string length. If you have four
> CPUs you can bring to bear, each CPU could take every fourth page,
> computing the expensive condition for each tuple in that page. The
> results of the scan can be retired asynchronously to the next node
> above.
>
> -jwb
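
The every-fourth-page scheme in the quoted text can be sketched as a strided
scan: worker k handles pages k, k+4, k+8, and so on, and each worker
evaluates the expensive condition on its own pages. This is only an
illustration; the toy table, the page and tuple counts, the length threshold
and all function names are assumptions made up for the example:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS     4    /* "four CPUs you can bring to bear" */
#define NUM_PAGES       64   /* hypothetical table size */
#define TUPLES_PER_PAGE 32   /* hypothetical tuples per page */

/* Toy table contents: three UTF-8 strings repeated over all tuples. */
static const char *const samples[3] = {
    "abc",                         /* 3 characters, 3 bytes */
    "h" "\xc3\xa9" "llo",          /* 5 characters, 6 bytes */
    "\xe6\x97\xa5\xe6\x9c\xac"     /* 2 characters, 6 bytes */
};

static const char *tuple_at(int page, int slot)
{
    return samples[(page * TUPLES_PER_PAGE + slot) % 3];
}

/* The "expensive condition": count UTF-8 characters, not bytes. */
static size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)   /* skip continuation bytes */
            n++;
    return n;
}

typedef struct {
    int  worker_id;
    long matches;    /* private to the worker, so no locking is needed */
} WorkerState;

/* Each worker scans pages worker_id, worker_id + NUM_WORKERS, ... */
static void *scan_worker(void *arg)
{
    WorkerState *w = arg;
    for (int page = w->worker_id; page < NUM_PAGES; page += NUM_WORKERS)
        for (int slot = 0; slot < TUPLES_PER_PAGE; slot++)
            if (utf8_length(tuple_at(page, slot)) > 2)
                w->matches++;
    return NULL;
}

/* Run the strided scan and return the number of matching tuples. */
static long parallel_scan_count(void)
{
    pthread_t   threads[NUM_WORKERS];
    WorkerState state[NUM_WORKERS];
    long total = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        state[i].worker_id = i;
        state[i].matches = 0;
        int rc = pthread_create(&threads[i], NULL, scan_worker, &state[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(threads[i], NULL);
        total += state[i].matches;
    }
    return total;
}
```

Here the per-worker counts are combined only after all workers finish; a
real executor node would instead hand matching tuples to the node above as
they are produced.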
