Re: Parallel Scaling of a pgplsql problem

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Venki Ramachandran <venki_ramachandran(at)yahoo(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Parallel Scaling of a pgplsql problem
Date: 2012-04-26 06:49:12
Message-ID: 4F98EFE8.5090609@gmail.com
Lists: pgsql-performance

On 2012-04-26 04:40, Venki Ramachandran wrote:
> Thanks Tom, clock_timestamp() worked. Appreciate it!!! and Sorry was
> hurrying to get this done at work and hence did not read through.
>
> Can you comment on how you would solve the original problem? Even if I
> can get the 11 seconds down to 500 ms for one pair, running it for
> 300k pairs will take multiple hours. How can one write a combination
> of a bash script/pgplsql code so as to use all 8 cores of a server. I
> am seeing that this is just executing in one session/process.

You want to run a calculation over the cross product 'employee x
employee'. If employee is partitioned into emp1, emp2, ... emp8, the
cross product equals the union of emp1 x employee, emp2 x employee,
... emp8 x employee. Each of these 8 partial cross products can be
executed in parallel. I'd look into dblink to run each of the 8 in its
own connection, and then union all of those results.

http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html
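
To make the decomposition concrete, here is a minimal Python sketch of
the idea, not of the dblink setup itself. It splits a list of ids into
8 disjoint partitions, computes each "partition x everyone" product in
its own worker, and concatenates the pieces (the role UNION ALL plays
in SQL). The names pair_calc, partition, cross_with_all and
parallel_cross, and the sample ids, are all hypothetical. Threads are
used only to keep the sketch self-contained; they will not speed up
CPU-bound Python code.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partition(items, n):
    """Split items into n disjoint slices (round-robin by position)."""
    return [items[k::n] for k in range(n)]


def pair_calc(a, b):
    """Hypothetical stand-in for the real per-pair calculation."""
    return a * b


def cross_with_all(part, everyone):
    """Compute one partial cross product: part x everyone."""
    return [(a, b, pair_calc(a, b)) for a, b in product(part, everyone)]


def parallel_cross(everyone, n_workers=8):
    """Run the n partial cross products concurrently and combine them."""
    parts = partition(everyone, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        chunks = pool.map(cross_with_all, parts, [everyone] * n_workers)
        # Concatenating the chunks corresponds to UNION ALL.
        return [row for chunk in chunks for row in chunk]


employees = list(range(1, 21))               # hypothetical employee ids
rows = parallel_cross(employees)             # 8 partial products, combined
full = cross_with_all(employees, employees)  # single-worker reference
```

Here sorted(rows) == sorted(full): the partitions are disjoint and
together cover every employee, so each pair is computed exactly once.
In PostgreSQL, each worker would instead be a separate dblink
connection; dblink_send_query starts a query without waiting for it,
and dblink_get_result collects its rows afterwards.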

regards,
Yeb
