Re: OpenMP in PostgreSQL-8.4.0

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Reydan Cankur <reydan(dot)cankur(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: OpenMP in PostgreSQL-8.4.0
Date: 2009-11-29 15:34:06
Message-ID: 407d949e0911290734t2ae00531icbc7002e469eadd5@mail.gmail.com
Lists: pgsql-performance

On Sun, Nov 29, 2009 at 1:24 PM, Reydan Cankur <reydan(dot)cankur(at)gmail(dot)com> wrote:
> So I am trying to understand that can anyone rewrite some functions in
> postgresql with OpenMP in order to increase performance.
> does this work?

Well you have to check the code path you're parallelizing for any
function calls which might manipulate any data structures and protect
those data structures with locks. That will be a huge job and
introduce extra overhead. If you try to find code which does nothing
like that you'll be limited to a few low-level pieces of code because
Postgres goes to great lengths to be generic and allow
user-configurable code in lots of places. To give one example, the
natural place to introduce parallelism would be in the sorting
routines -- but the comparison routine is a data-type-specific
function that users can specify at the SQL level and is allowed to do
almost anything.
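
To make the comparator problem concrete, here is a minimal sketch in plain C rather than Postgres code. The names compare_calls, cmp_int_counting and sort_ints are made up for illustration. The point is only that the sort sees an opaque function pointer and cannot know whether the callback touches shared state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared state that a user-supplied comparator might touch. */
long compare_calls = 0;

/* Comparison callback with a side effect. Called from one thread this is
 * harmless; called from several threads at once the increment below is an
 * unsynchronized write, i.e. a data race. */
int cmp_int_counting(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    compare_calls++;
    return (x > y) - (x < y);
}

/* The sort routine only receives a function pointer, so it has no way to
 * tell whether the comparator is safe to run concurrently. */
void sort_ints(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof(int), cmp_int_counting);
}
```

A parallel sort would have to either lock around every comparator call or restrict itself to comparators known to be free of side effects.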

Then you'll have to worry about things like signal handlers. Anything
big enough to be worth parallelizing is going to have a
CHECK_FOR_INTERRUPTS in it, and you'll have to make sure the interrupt
is received and processed correctly, cancelling all threads and
throwing an error properly.

Come to think of it you'll have to handle PG_TRY() and PG_RE_THROW()
properly. That means if an error occurs in any thread you have to
make sure that you kill all the threads that have been spawned in that
PG_TRY block and throw the correct error up.
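
One possible shape for this, sketched in standalone C with OpenMP and not taken from the Postgres source: instead of killing threads, each iteration polls a shared failure flag and skips its work once the flag is set, and the error is reported only after the parallel region has ended, back on the original thread. process_item and process_all are hypothetical names, and the atomic read/write forms assume an OpenMP 3.1 or later compiler. Without OpenMP enabled the pragmas are ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-item work: returns 0 on success, nonzero on failure. */
int process_item(int item, int *result)
{
    if (item < 0)
        return 1;
    *result = item * 2;
    return 0;
}

/* Returns 0 if every item succeeded, -1 if any iteration failed.
 * Worker threads never jump out of the parallel region themselves;
 * they only set the flag, and the caller raises the error afterwards. */
int process_all(const int *items, int *results, int n)
{
    int failed = 0;

    assert(n >= 0);

    #pragma omp parallel for shared(failed)
    for (int i = 0; i < n; i++) {
        int stop;

        #pragma omp atomic read
        stop = failed;
        if (stop)
            continue;           /* another iteration failed: skip the work */

        if (process_item(items[i], &results[i]) != 0) {
            #pragma omp atomic write
            failed = 1;
        }
    }

    return failed ? -1 : 0;     /* the error is raised here, on one thread */
}
```

In a real backend the "raise it afterwards" step is where the error would be thrown, so that the longjmp-style unwinding only ever happens on the thread that entered the PG_TRY block.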

Incidentally I doubt heap_deformtuple is suitable for parallelization.
It loops over the tuple and the processing for each field depends
completely on the previous one. When you have that kind of chained
dependency adding threads doesn't help. You need a loop somewhere
where each iteration of the loop can be processed independently. You
might find such loops in the executor for things like hash joins or
nested loops. But they will definitely involve user-defined functions
and even I/O for each iteration of the loop, so you'll have to take
precautions against the usual multi-threading dangers.
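
The difference between the two kinds of loop can be sketched in a few lines of C. This is an illustration, not the actual heap_deformtuple code. field_offsets and square_all are made-up names. The first loop carries a value from one iteration to the next, so it cannot be split across threads as written. The second has independent iterations, so an OpenMP parallel for is at least legal there.

```c
#include <assert.h>
#include <stddef.h>

/* Chained dependency: each field's offset is the previous offset plus the
 * previous field's length, so iteration i cannot start until iteration
 * i-1 has finished. */
void field_offsets(const size_t *lengths, size_t nfields, size_t *offsets)
{
    size_t off = 0;

    for (size_t i = 0; i < nfields; i++) {
        offsets[i] = off;
        off += lengths[i];      /* value carried into the next iteration */
    }
}

/* Independent iterations: out[i] depends only on in[i], so the iterations
 * can be handed to different threads. The pragma is ignored when the
 * compiler is not run with OpenMP enabled. */
void square_all(const int *in, int *out, int n)
{
    assert(n >= 0);

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        out[i] = in[i] * in[i];
}
```

Even in the second case the loop body here is trivial. Once the body calls user-defined functions or does I/O, the locking and error-handling concerns above apply to it as well.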

--
greg
