Re: Choosing parallel_degree

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, James Sewell <james(dot)sewell(at)lisasoft(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andreas Ulbrich <andreas(dot)ulbrich(at)matheversum(dot)de>
Subject: Re: Choosing parallel_degree
Date: 2016-04-11 21:42:14
Message-ID: CANP8+jJThmrUTu6nAKbyrzeFjG86Nj0zYdL8n07Yc3Qr_cg3jQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8 April 2016 at 17:27, Paul Ramsey <pramsey(at)cleverelephant(dot)ca> wrote:

> PostGIS is just one voice...
>

We're listening.

> >> Functions have very unequal CPU costs, and we're talking here about
> >> using CPUs more effectively, why are costs being given the see-no-evil
> >> treatment? This is as true in core as it is in PostGIS, even if our
> >> case is a couple orders of magnitude more extreme: a filter based on a
> >> complex combination of regex queries will use an order of magnitude
> >> more CPU than one that does a little math, why plan and execute them
> >> like they are the same?
> >
> > Functions have user assignable costs.
>
> We have done a relatively bad job of globally costing our functions
> thus far, because it mostly didn't make any difference. In my testing
> [1], I found that costing could push better plans for parallel
> sequence scans and parallel aggregates, though at very extreme cost
> values (1000 for sequence scans and 10000 for aggregates)
>
> Obviously, if costs can make a difference for 9.6 and parallelism
> we'll rigorously ensure we have good, useful costs. I've already
> costed many functions in my parallel postgis test branch [2]. Perhaps
> the avoidance of cost so far is based on the relatively nebulous
> definition it has: about the only thing in the docs is "If the cost is
> not specified, 1 unit is assumed for C-language and internal
> functions, and 100 units for functions in all other languages. Larger
> values cause the planner to try to avoid evaluating the function more
> often than necessary."
>
> So what about C functions then? Should a string comparison be 5 and a
> multiplication 1? An image histogram 1000?
>

We don't have a clear methodology for how to do this.

It's a single parameter to allow you to achieve the plans that work
optimally. Hopefully that is simple enough for everyone to use and yet
flexible enough to make a difference.

If its not what you need, show us and it may make the case for change.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2016-04-11 21:45:59 Re: Choosing parallel_degree
Previous Message Andres Freund 2016-04-11 21:40:29 Re: Move PinBuffer and UnpinBuffer to atomics