Re: Performance issues with parallelism and LIMIT

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com
Subject: Re: Performance issues with parallelism and LIMIT
Date: 2025-11-18 15:07:04
Message-ID: 0e9fbe28-bb7b-41fc-a505-a34a414f12fe@gmail.com
Lists: pgsql-hackers

Hi Tomas!

On 18.11.2025 15:59, Tomas Vondra wrote:
>>
>> Some clarifications: I'm not inventing a new way to signal workers but
>> I'm using the existing SendProcSignal() machinery to inform parallel
>> workers to stop. I just added another signal PROCSIG_PARALLEL_STOP and
>> the corresponding functions to handle it from ProcessInterrupts().
>>
>
> Sure, but I still don't quite see the need to do all this.
>
>> What is "new" is how I'm stopping the parallel workers once they've
>> received the stop signal: the challenge is that the workers need to
>> actually jump out of whatever they are doing - even if they aren't
>> producing any rows at this point; but e.g. are scanning a table
>> somewhere deep down in ExecScan() / SeqNext().
>>
>> The only way I can see to make this work, without a huge patch that adds
>> new code all over the place, is to instruct process termination from
>> inside ProcessInterrupts(). I'm siglongjmp-ing out of the ExecutorRun()
>> function so that all parallel worker cleanup code still runs as if the
>> worker had run to completion. I've tried to end the process without
>> siglongjmp-ing, but that caused all sorts of fallout (instrumentation
>> not collected, the postmaster thinking the process stopped
>> unexpectedly, ...).
>>
>> Instead of siglongjmp-ing we could maybe call some parallel worker
>> shutdown function but that would require access to the parallel worker
>> state variables, which are currently not globally accessible.
>>
>
> But why? The leader and workers already share state - the parallel scan
> state (for the parallel-aware scan on the "driving" table). Why couldn't
> the leader set a flag in the scan, and force it to end in workers? Which
> AFAICS should lead to workers terminating shortly after that.
>
> All the code / handling is already in place. It will need a bit of new
> code in the parallel scans, but not much I think.
>

But this would only work for the SeqScan case, wouldn't it? The parallel
worker might equally well be executing other code which doesn't produce
tuples, such as a parallel index scan, a big sort, building a hash table, etc.

I thought this was not a viable solution because it would need changes in
all of these places.

--
David Geier
