Re: Parallel Seq Scan

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-01-09 21:15:08
Message-ID: 54B044DC.4070104@kaltenbrunner.cc
Lists: pgsql-hackers

On 01/09/2015 08:01 PM, Stephen Frost wrote:
> Amit,
>
> * Amit Kapila (amit(dot)kapila16(at)gmail(dot)com) wrote:
>> On Fri, Jan 9, 2015 at 1:02 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>>> I agree, but we should try and warn the user if they set
>>> parallel_seqscan_degree close to max_worker_processes, or at least give
>>> some indication of what's going on. This is something you could end up
>>> beating your head on wondering why it's not working.
>>
>> Yet another way to handle the case when enough workers are not
>> available is to let the user specify the desired minimum percentage of
>> requested parallel workers with a parameter like
>> PARALLEL_QUERY_MIN_PERCENT. For example, if you specify
>> 50 for this parameter, then at least 50% of the parallel workers
>> requested for any parallel operation must be available in order for
>> the operation to succeed, else it will give an error. If the value is
>> set to null, then all parallel operations will proceed as long as at
>> least two parallel workers are available for processing.
>
> Ugh. I'm not a fan of this. Based on how we're talking about modeling
> this, if we decide to parallelize at all, then we expect it to be a win.
> I don't like the idea of throwing an error if, at execution time, we end
> up not being able to actually get the number of workers we want -
> instead, we should degrade gracefully all the way back to serial, if
> necessary. Perhaps we should send a NOTICE or something along those
> lines to let the user know we weren't able to get the level of
> parallelization that the plan originally asked for, but I really don't
> like just throwing an error.

Yeah, this seems like the behaviour I would expect: if we can't get
enough parallel workers we should just use as many as we can get.
Everything else, and especially erroring out, will just cause random
application failures and easy DoS vectors.
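
To illustrate, here is a rough sketch of that degrade-gracefully
behaviour (plain C with invented helper names, not the actual executor
or bgworker API):

#include <stdio.h>

/* Invented stand-in: how many bgworker slots are currently free. */
static int
workers_available(void)
{
    return 3;
}

/*
 * Launch up to nrequested workers for a scan, settling for however
 * many we can actually get - all the way down to zero, which simply
 * means the leader scans alone (serial execution).
 */
static int
launch_scan_workers(int nrequested)
{
    int     nlaunched = workers_available();

    if (nlaunched > nrequested)
        nlaunched = nrequested;

    if (nlaunched < nrequested)
        printf("NOTICE:  requested %d parallel workers, got %d%s\n",
               nrequested, nlaunched,
               nlaunched == 0 ? " (falling back to serial)" : "");

    return nlaunched;
}

int
main(void)
{
    printf("scanning with %d worker(s) plus the leader\n",
           launch_scan_workers(8));
    return 0;
}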
I think all we need initially is the ability to specify a "maximum
number of workers per query" as well as a "maximum number of workers in
total for parallel operations".
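
A similar sketch of how those two knobs might be applied at plan time
(the GUC names here are made up for the example):

#include <stdio.h>

/* Invented GUCs matching the two limits described above. */
static int  max_workers_per_query = 4;      /* per-query cap          */
static int  max_parallel_workers = 16;      /* system-wide cap        */
static int  parallel_workers_in_use = 14;   /* already handed out     */

static int
plan_worker_count(int nrequested)
{
    int     remaining = max_parallel_workers - parallel_workers_in_use;
    int     n = nrequested;

    if (n > max_workers_per_query)
        n = max_workers_per_query;
    if (n > remaining)
        n = remaining;          /* may drop to 0, i.e. a serial plan */

    return (n < 0) ? 0 : n;
}

int
main(void)
{
    /* With 14 of 16 slots busy, a request for 8 gets clamped to 2. */
    printf("planned workers: %d\n", plan_worker_count(8));
    return 0;
}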

>
> Now, for debugging purposes, I could see such a parameter being
> available but it should default to 'off/never-fail'.

Not sure what that would really be useful for - if I execute a query I
would truly expect it to get answered. If it can be made faster by
running it in parallel, that's nice, but why would I want it to fail?

Stefan
