Re: Choosing parallel_degree

From: James Sewell <james(dot)sewell(at)lisasoft(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Choosing parallel_degree
Date: 2016-03-15 02:24:08
Message-ID: CANkGpBvJi+EwYEuj2SNczbgDp3APF702+iZ33_7scHD5MfYsTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks David,

Eventually it would be great to take into account the cost of the function
doing the agg (pg_proc.procost, which is a multiple of CPU units).

This would allow people to mark specific aggregations as needing more CPU
power, therefore needing more workers per page (or should it be tuple in
this case?).

In the meantime some way to manually influence this would be good. I just
did some testing (on an 8VCPU machine) with a 139MB table, which gets 3
workers currently.

For a count(*) I get a time of 131.754 ms. If I increase this to 8 workers
I get around 86.193 ms.

Obviously this doesn't mean much as YMMV - but it does show that the
ability to manually adjust the scaling would be great, especially in
data warehouse or reporting environments.

I did want to test with some really slow aggs, but even when I take out the
small table test in create_parallel_paths I can't seem to get a parallel
plan for a tiny table. Any idea on why this would be David?

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect
______________________________________

Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000 *W* www.lisasoft.com *F *(+61) 3 8370 8099

On Tue, Mar 15, 2016 at 12:25 PM, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com
> wrote:

> Over in [1] James mentioned about wanting more to be able to have more
> influence over the partial path's parallel_degree decision. At risk
> of a discussion on that hijacking the parallel aggregate thread, I
> thought I'd start this for anyone who would want to discuss making
> changes to that.
>
> I've attached a simple C program which shows the parallel_degree which
> will be chosen at the moment. For now it's based on the size of the
> base relation. Perhaps that will need to be rethought later, perhaps
> based on costs. But I just don't think it's something for 9.6.
>
> Here's the output of the C program.
>
> For 1 pages there will be 1 workers (rel size 0 MB, 0 GB)
> For 3001 pages there will be 2 workers (rel size 23 MB, 0 GB)
> For 9001 pages there will be 3 workers (rel size 70 MB, 0 GB)
> For 27001 pages there will be 4 workers (rel size 210 MB, 0 GB)
> For 81001 pages there will be 5 workers (rel size 632 MB, 0 GB)
> For 243001 pages there will be 6 workers (rel size 1898 MB, 1 GB)
> For 729001 pages there will be 7 workers (rel size 5695 MB, 5 GB)
> For 2187001 pages there will be 8 workers (rel size 17085 MB, 16 GB)
> For 6561001 pages there will be 9 workers (rel size 51257 MB, 50 GB)
> For 19683001 pages there will be 10 workers (rel size 153773 MB, 150 GB)
> For 59049001 pages there will be 11 workers (rel size 461320 MB, 450 GB)
> For 177147001 pages there will be 12 workers (rel size 1383960 MB, 1351 GB)
> For 531441001 pages there will be 13 workers (rel size 4151882 MB, 4054 GB)
> For 1594323001 pages there will be 14 workers (rel size 12455648 MB, 12163
> GB)
>
> [1]
> http://www.postgresql.org/message-id/CANkGpBtUvzpdvF2=_iQ64UjmVrPYcS6d4i9-wepbUsq1sq+AWw@mail.gmail.com
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

--

------------------------------
The contents of this email are confidential and may be subject to legal or
professional privilege and copyright. No representation is made that this
email is free of viruses or other defects. If you have received this
communication in error, you may not copy or distribute any part of it or
otherwise disclose its contents to anyone. Please advise the sender of your
incorrect receipt of this correspondence.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-03-15 02:30:05 Re: Minor bug affecting ON CONFLICT lock wait log messages
Previous Message Noah Misch 2016-03-15 02:04:41 Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system