Re: PostgreSQL Parallel Processing !

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: sridhar bamandlapally <sridhar(dot)bn1(at)gmail(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>, Venkat Balaji <venkat(dot)balaji(at)verse(dot)in>, ashish nauriyal <anauriyal(at)gmail(dot)com>
Subject: Re: PostgreSQL Parallel Processing !
Date: 2012-01-25 13:43:23
Message-ID: CAGTBQpbm2uSfsQ=B-ASH36dKMknnoYmJFwe+OmSqjX6X_vvgUg@mail.gmail.com
Lists: pgsql-performance

On Wed, Jan 25, 2012 at 6:18 AM, sridhar bamandlapally
<sridhar(dot)bn1(at)gmail(dot)com> wrote:
> I just want to illustrate an idea that may be possible for bringing
> parallel processing to PostgreSQL at the SQL query level
>
> The PARALLEL option in Oracle really gives a great improvement in
> performance; the multi-thread concept has great possibilities
>
> In Oracle we have hints ( see below ) :
> SELECT /*+PARALLEL( e, 2 )*/ e.* FROM EMP e ;
>
> PostgreSQL ( maybe, if possible, in the future ) :
> SELECT e.* FROM EMP PARALLEL ( e, 2) ;

It makes little sense (and is contrary to PG's policy of no hinting) to
do it like that.

In fact, I've been musing for a long time on leveraging pg's
sophisticated planner to do the parallelization:
 * Synchroscan means that whenever a table has to be scanned twice, the
two scans can run in two threads.
 * Knowing whether a scan will hit mostly disk or memory can help in
deciding whether to run scans in parallel or not (memory access can be
parallelized, and interleaved memory access isn't so bad, but
interleaved disk access is disastrous).
 * Big sorts can be parallelized quite easily.
 * The number of threads to use can be a tunable, or be set
automatically to the number of processors on the system.
 * Pipelining is another useful plan transformation: run I/O-bound
nodes in parallel with CPU-bound ones.
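
To make the pipelining item concrete, here is a rough sketch in Python
(not PG code, and not how the executor works today). The names
pipeline, read_rows and process_row are made up for illustration: one
thread stands in for an I/O-bound scan node and feeds a bounded buffer,
while the caller stands in for the CPU-bound node consuming it.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the row stream


def pipeline(read_rows, process_row, buffer_size=64):
    """Overlap a producer (I/O-bound stand-in) with a consumer (CPU-bound stand-in).

    read_rows and process_row are hypothetical callables, not PG APIs.
    """
    buf = queue.Queue(maxsize=buffer_size)  # bounded, so the producer cannot run far ahead

    def producer():
        try:
            for row in read_rows():
                buf.put(row)  # blocks while the buffer is full
        finally:
            buf.put(_DONE)  # always signal completion, even on error

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    results = []
    while True:
        row = buf.get()  # blocks while the buffer is empty
        if row is _DONE:
            break
        results.append(process_row(row))
    worker.join()
    return results
```

The bounded buffer is the important part: it keeps the fast side from
getting arbitrarily far ahead of the slow one, and row order is
preserved because there is a single producer and a single consumer.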

I know squat about how to implement this, but I've been considering
picking the low-hanging fruit on that tree and patching up PG to try
the concept. Many of the items above would require a thread-safe
execution engine, which may be quite hard to get and may carry a
significant performance hit. Some don't, like parallel sort.
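
As a sketch of why parallel sort is the easy case, the shape is: split
the input into runs, sort each run independently, then merge the sorted
runs. The Python below is only an illustration under my own
assumptions (parallel_sort and its workers parameter are made-up names,
not anything in PG). It uses a thread pool to show the structure; in
CPython, threads won't actually speed up a CPU-bound sort, so treat it
as a picture of the algorithm and nothing more.

```python
import heapq
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(rows, workers=None):
    """Sort a list by sorting independent runs concurrently, then merging them.

    workers defaults to the number of processors, mirroring the
    "tunable or automatic" idea; both names are hypothetical.
    """
    workers = workers or os.cpu_count() or 1
    if not rows:
        return []

    # Split into at most `workers` contiguous runs (ceiling division).
    run_len = -(-len(rows) // workers)
    runs = [rows[i:i + run_len] for i in range(0, len(rows), run_len)]

    # Each run is sorted independently; no shared state between workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))

    # k-way merge of the sorted runs into one sorted output.
    return list(heapq.merge(*sorted_runs))
```

The only coordination point is the final merge, which is why this one
doesn't need the rest of the executor to be thread-safe.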

Also, it's worth noting that parallelization will create some
priority inversion issues. Simple, non-parallelizable queries will
suffer from resource starvation when contending against more complex,
parallelizable ones.
