| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | David Geier <geidav(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com |
| Subject: | Re: Performance issues with parallelism and LIMIT |
| Date: | 2025-11-18 14:59:16 |
| Message-ID: | 95abfd87-d0f3-43e1-b4c2-b97645360e04@vondra.me |
| Lists: | pgsql-hackers |
On 11/18/25 15:06, David Geier wrote:
> Hi Tomas!
>
> On 15.11.2025 00:00, Tomas Vondra wrote:
>> On 11/14/25 19:20, David Geier wrote:
>>>
>>> Ooops. That can likely be fixed.
>>>
>
> I'll take a look at why this happens in the next few days, if you think
> this approach generally has a chance of being accepted. See below.
>
>>>> And I very much doubt inventing a new ad hoc way to signal workers is
>>>> the right solution (even if there wasn't the InstrEndLoop issue).
>>>>
>>
>> Good point, I completely forgot about (2).
>>
>
> In that light, could you take another look at my patch?
>
> Some clarifications: I'm not inventing a new way to signal workers but
> I'm using the existing SendProcSignal() machinery to inform parallel
> workers to stop. I just added another signal PROCSIG_PARALLEL_STOP and
> the corresponding functions to handle it from ProcessInterrupts().
>
Sure, but I still don't quite see the need to do all this.
> What is "new" is how I'm stopping the parallel workers once they've
> received the stop signal: the challenge is that the workers need to
> actually jump out of whatever they are doing - even if they aren't
> producing any rows at this point but are, e.g., scanning a table
> somewhere deep down in ExecScan() / SeqNext().
>
> The only way I can see to make this work, without a huge patch that adds
> new code all over the place, is to initiate process termination from
> inside ProcessInterrupts(). I'm siglongjmp-ing out of the ExecutorRun()
> function so that all parallel worker cleanup code still runs as if the
> worker had run to completion. I've tried to end the process without the
> siglongjmp, but that caused all sorts of fallout (instrumentation not
> collected, postmaster thinking the process stopped unexpectedly, ...).
>
> Instead of siglongjmp-ing, we could maybe call some parallel worker
> shutdown function, but that would require access to the parallel worker
> state variables, which are currently not globally accessible.
>
But why? The leader and workers already share state - the parallel scan
state (for the parallel-aware scan on the "driving" table). Why couldn't
the leader set a flag in that scan state and force the scan to end in the
workers? AFAICS that should lead to the workers terminating shortly after.

All the code / handling is already in place. It will need a bit of new
code in the parallel scans, but not much, I think.
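
To make this a bit more concrete, here's a rough sketch of the kind of
thing I mean - this is not from any patch, and the field / function names
(phs_stop_scan, ExecParallelStopScan) are made up for illustration. The
idea is just a shared flag in the block-based parallel scan state that
workers check whenever they ask for the next page:

    /* access/relscan.h - existing fields omitted */
    typedef struct ParallelBlockTableScanDescData
    {
        ParallelTableScanDescData base;
        /* ... phs_nblocks, phs_startblock, phs_nallocated, ... */
        pg_atomic_uint32 phs_stop_scan; /* leader: "enough rows, stop" */
    } ParallelBlockTableScanDescData;

    /* leader side, e.g. once the Limit node has what it needs */
    static void
    ExecParallelStopScan(ParallelBlockTableScanDesc pbscan)
    {
        pg_atomic_write_u32(&pbscan->phs_stop_scan, 1);
    }

    /* worker side, at the top of table_block_parallelscan_nextpage() */
    if (pg_atomic_read_u32(&pbscan->phs_stop_scan) != 0)
        return InvalidBlockNumber;  /* pretend we ran out of blocks */

The worker then simply runs out of tuples, finishes the scan through the
normal code path and shuts down the usual way, so instrumentation and the
rest of the cleanup keep working as they do today.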
regards
--
Tomas Vondra