Re: Duplicate Workers entries in some EXPLAIN plans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Georgios Kokolatos <gkokolatos(at)pm(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Duplicate Workers entries in some EXPLAIN plans
Date: 2020-01-26 23:00:21
Message-ID: 18781.1580079621@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> I wonder if we could introduce a debug GUC that makes parallel worker
>> acquisition just retry in a loop, for a time determined by the GUC. That
>> obviously would be a bad idea to do in a production setup, but it could
>> be good enough for regression tests? There are some deadlock dangers,
>> but I'm not sure they really matter for the tests.

> Hmmm .... might work. Seems like a better idea than "run it by itself"
> as we have to do now.

The more I think about this, the more it seems like a good idea, and
not only for regression test purposes. If you're about to launch a
query that will run for hours even with the max number of workers,
you don't want it to launch with less than that number just because
somebody else was eating a worker slot for a few milliseconds.

So I'm imagining a somewhat general-purpose GUC defined like
"max_delay_to_acquire_parallel_worker", measured say in milliseconds.
The default would be zero (current behavior: try once and give up),
but you could set it to small positive values if you have that kind
of production concern, while the regression tests could set it to big
positive values. This would alleviate all sorts of problems we have
with not being able to assume stable results from parallel worker
acquisition in the tests.
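
To make the proposed semantics concrete, here is a minimal sketch in C of "retry until a deadline, then give up". It is not PostgreSQL code: the clock and the worker slot are simulated, and every helper name (try_acquire_worker, sim_sleep_ms, acquire_worker_with_delay, simulate) as well as the 1 ms retry interval is an assumption made for illustration. Only the GUC name and its "zero means try once" default come from the proposal above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proposed GUC, in milliseconds.
 * 0 reproduces the current behavior: try once and give up. */
static int max_delay_to_acquire_parallel_worker = 0;

/* Simulated environment (assumption): a fake clock, and the time at
 * which some other session releases the worker slot we want. */
static long sim_now_ms = 0;
static long slot_free_at_ms = 0;

/* One acquisition attempt; succeeds only once the slot has been freed. */
static bool
try_acquire_worker(void)
{
    return sim_now_ms >= slot_free_at_ms;
}

/* Advance the fake clock instead of really sleeping. */
static void
sim_sleep_ms(long ms)
{
    sim_now_ms += ms;
}

/* Retry acquisition until it succeeds or the configured delay expires. */
bool
acquire_worker_with_delay(void)
{
    const long  retry_interval_ms = 1;  /* assumed polling interval */
    long        deadline;

    assert(max_delay_to_acquire_parallel_worker >= 0);
    deadline = sim_now_ms + max_delay_to_acquire_parallel_worker;

    for (;;)
    {
        if (try_acquire_worker())
            return true;        /* got the worker */
        if (sim_now_ms >= deadline)
            return false;       /* out of time: run with fewer workers */
        sim_sleep_ms(retry_interval_ms);
    }
}

/* Reset the simulation and run one scenario: the slot stays busy for
 * busy_ms, and we are willing to wait up to max_delay_ms for it. */
bool
simulate(long busy_ms, int max_delay_ms)
{
    sim_now_ms = 0;
    slot_free_at_ms = busy_ms;
    max_delay_to_acquire_parallel_worker = max_delay_ms;
    return acquire_worker_with_delay();
}
```

In this sketch, simulate(5, 0) fails because the single attempt finds the slot busy, while simulate(5, 10) succeeds once the simulated clock reaches 5 ms. A real implementation would also have to deal with the deadlock dangers mentioned upthread, which the sketch ignores.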

regards, tom lane
