| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | David Geier <geidav(dot)pg(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com |
| Subject: | Re: Performance issues with parallelism and LIMIT |
| Date: | 2025-11-14 23:00:06 |
| Message-ID: | a371afc5-eb20-48d2-b6f3-01f976f9f922@vondra.me |
| Lists: | pgsql-hackers |
On 11/14/25 19:20, David Geier wrote:
> Hi Tomas!
>
> On 13.11.2025 23:36, Tomas Vondra wrote:
>> ...
>>
>> Unfortunately, I don't think this patch is the way to go. When I apply
>> it, I get:
>>
>> ERROR: InstrEndLoop called on running node
>> CONTEXT: parallel worker
>>
>
> Ooops. That can likely be fixed.
>
>> And I very much doubt inventing a new ad hoc way to signal workers is
>> the right solution (even if there wasn't the InstrEndLoop issue).
>>
>> What I think we should do is much simpler - make the threshold in shm_mq
>> dynamic, start with a very low value and gradually ramp up (up to 1/4).
>> So we'd have
>>
>> if (mqh->mqh_consume_pending > threshold)
>>
>> We might start with
>>
>> threshold = (mq->mq_ring_size / 1024)
>>
>> or maybe some fixed value, like
>>
>> threshold = 128
>>
>> And on every signal we'd double it, capping it to 1/4 of mq_ring_size.
>>
>> threshold = Min(threshold * 2, mq->mq_ring_size / 4);
>>
>> This is very similar to other places doing this gradual ramp up, like in
>> the prefetching / read_stream, etc. It allows fast termination for low
>> LIMIT values, but quickly amortizes the cost for high LIMIT values.
>
> That's a different problem though, isn't it? The original thread
> contained two problems: (1) signaling the queue too late and (2) workers
> stopping too late in the presence of LIMIT if they're not finding any
> rows in their portion of the data.
>
> Changing the queue thresholds is a solution for (1) but not for (2). For
> (2) we need a mechanism to instruct the parallel workers to stop when we
> find that other parallel workers have already produced enough rows to
> answer the query.
>
Good point, I completely forgot about (2).
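For (1), here's roughly the ramp-up I had in mind, as a sketch. It
assumes a new mqh_consume_threshold field in shm_mq_handle (the name is
made up), initialized to something small (say, 128) in shm_mq_attach():

    /* in shm_mq_receive(), instead of the fixed 1/4 test */
    if (mqh->mqh_consume_pending > mqh->mqh_consume_threshold)
    {
        shm_mq_inc_bytes_read(mq, mqh->mqh_consume_pending);
        mqh->mqh_consume_pending = 0;

        /* double the threshold, capped at 1/4 of the ring size */
        mqh->mqh_consume_threshold = Min(mqh->mqh_consume_threshold * 2,
                                         mq->mq_ring_size / 4);
    }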
> An alternative mechanism that might work is using some stop_worker
> boolean in shared memory that we check in CHECK_FOR_INTERRUPTS().
> stop_worker is set to true by the leader as soon as it has collected
> enough tuples. But then CHECK_FOR_INTERRUPTS() would have to have access
> to the parallel context, which might also be a bit ugly.
>
Hmmm, yeah. We already have shared state for the parallel scan. Do you
think we could integrate this into that? The scan would then simply
look "finished" to all the workers ...
regards
--
Tomas Vondra