## Re: Parallel Scaling of a pgplsql problem

From: Samuel Gendler Venki Ramachandran "pgsql-performance(at)postgresql(dot)org" Re: Parallel Scaling of a pgplsql problem 2012-04-25 19:36:25 CAEV0TzAVQXf42n9vXu26JAe0-eQ9AgwwAErYFn0HMm8ScQaodA@mail.gmail.com (view raw or whole thread) 2012-04-25 18:52:03 from Venki Ramachandran  2012-04-25 19:36:25 from Samuel Gendler   2012-04-25 21:12:03 from Venki Ramachandran    2012-04-25 21:26:22 from Pavel Stehule     2012-04-25 21:45:38 from Venki Ramachandran      2012-04-25 21:52:08 from Tom Lane       2012-04-26 02:40:18 from Venki Ramachandran        2012-04-26 06:49:12 from Yeb Havinga  2012-04-25 20:04:45 from "Kevin Grittner"  2012-04-25 20:09:26 from Pavel Stehule  2012-04-25 22:05:43 from Merlin Moncure  2012-04-26 03:41:13 from Jan Nielsen  2012-04-26 16:13:36 from Greg Spiegelberg pgsql-performance
```On Wed, Apr 25, 2012 at 11:52 AM, Venki Ramachandran <
venki_ramachandran(at)yahoo(dot)com> wrote:

> Hi all:
> Can someone please guide me as to how to solve this problem? If this is
> the wrong forum, please let me know which one to post this one in. I am new
> to Postgres (about 3 months into it)
>
> I have PostGres 9.0 database in a AWS server (x-large) and a pgplsql
> program that does some computation. It takes in a date range and for one
> pair of personnel (two employees in a company) it calculates some values
> over the time period. It takes about 40ms (milli seconds) to complete and
> give me the answer. All good so far.
>
> Now I have to run the same pgplsql on all possible combinations of
> employees and with 542 employees that is about say 300,000 unique pairs.
>
> So (300000 * 40)/(1000 * 60 * 60) = 3.33 hours and I have to rank them and
> show it on a screen. No user wants to wait for 3 hours,  they can probably
> wait for 10 minutes (even that is too much for a UI application). How do I
> solve this scaling problem? Can I have multiple parellel sessions and each
> session have multiple/processes that do a pair each at 40 ms and then
> collate the results. Does PostGres or pgplsql have any parallel computing
> capability.
>

The question is, how much of that 40ms is spent performing the calculation,
how much is spent querying, and how much is function call overhead, and how
much is round trip between the client and server with the query and
results?  Depending upon the breakdown, it is entirely possible that the
actual per-record multiplier can be kept down to a couple of milliseconds
if you restructure things to query data in bulk and only call a single
function to do the work.  If you get it down to 4ms, that's a 20 minute
query.  Get it down to 1ms and you're looking at only 5 minutes for what
would appear to be a fairly compute-intensive report over a relatively
large dataset.
```

### pgsql-performance by date

 Next: From: Kevin Grittner Date: 2012-04-25 20:04:45 Subject: Re: Parallel Scaling of a pgplsql problem Previous: From: Robert Klemme Date: 2012-04-25 18:55:09 Subject: Re: Configuration Recommendations