
Re: Don't Thread On Me (PostgreSQL related)

From: Eduardo Morras <nec556(at)retena(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Don't Thread On Me (PostgreSQL related)
Date: 2012-01-27 09:28:01
Message-ID: 4EFDA3B500566D8D@
Lists: pgsql-general

At 00:32 27/01/2012, you wrote:

>There are cases where intraquery parallelism would be helpful.  As 
>far as I understand it, PostgreSQL is the only major, solid (i.e. 
>excluding MySQL) RDBMS which does not offer some sort of intraquery 
>parallelism, and when running queries across very large databases, 
>it might be helpful to be able to, say, scan different partitions 
>simultaneously using different threads.  So I think it is wrong to 
>simply dismiss the need out of hand.  The thing though is that I am 
>not sure that where this need really comes to the fore, it is 
>typical of single-server instances, and so this brings me to the 
>bigger question.
>
>The question in my mind though is a more basic one:  How should 
>intraquery parallelism be handled?  Is it something PostgreSQL needs 
>to do or is it something that should be the work of an external 
>project like Postgres-XC?  Down the road is there value in merging 
>the codebases, perhaps making stand-alone/data/coordination node a 
>compile time option?

I still don't think threads are the solution for this scenario. You 
can do intraquery parallelism with multiple processes more easily and 
more safely than with multiple threads. You launch a process with the 
whole query; it divides the work into chunks and assigns them to 
different processes instead of threads. The processes can use shared 
resources to communicate with each other. When all the work is done, 
they pass their results to the original process, which joins them. The 
main advantage of doing it with processes is that if one of the child 
processes dies, it can be killed and relaunched without any damage to 
the work of its siblings, whereas if a thread crashes, the whole 
process and all the work done so far is lost.

That is not the only advantage of processes over threads. Some years 
ago, one of the problems on multi-socket servers was shared memory and 
communication between the sockets: inter-CPU speed was too low and 
latency too high. Now we have multiple cores in one socket and faster 
inter-socket communication, so this is no longer a problem. Even 
better, the speed and latency of communication between two or more 
servers (not sockets or CPUs) is reaching levels where PostgreSQL 
could share memory between them, for example using HyperTransport 
cards or modern FC, and it is much easier to launch a remote process 
than a remote thread.


>Obviously such is not a question that needs to be addressed now.  We 
>can wait until someone has something that is production-ready and 
>relatively feature-complete before discussing merging projects.
>
>Best Wishes,
>Chris Travers


