Re: Performance issues with parallelism and LIMIT

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com
Subject: Re: Performance issues with parallelism and LIMIT
Date: 2025-11-18 18:35:32
Message-ID: 854c29cc-a535-4f28-b1b1-464080b25ef5@gmail.com
Lists: pgsql-hackers


On 18.11.2025 18:31, Tomas Vondra wrote:
> On 11/18/25 17:51, Tom Lane wrote:
>> David Geier <geidav(dot)pg(at)gmail(dot)com> writes:
>>> On 18.11.2025 16:40, Tomas Vondra wrote:
>>>> It'd need code in the parallel-aware scans, i.e. seqscan, bitmap, index.
>>>> I don't think you'd need code in other plans, because all parallel plans
>>>> have one "driving" table.
>>
>> You're assuming that the planner will insert Gather nodes at arbitrary
>> places in the plan, which isn't true. If it does generate plans that
>> are problematic from this standpoint, maybe the answer is "don't
>> parallelize in exactly that way".
>>
>
> I think David has a point that nodes that "buffer" tuples (like Sort or
> HashAgg) would break the approach making this the responsibility of the
> parallel-aware scan. I don't see anything particularly wrong with such
> plans - plans with partial aggregation often look like that.
>
> Maybe this should be the responsibility of execProcnode.c, not the
> various nodes?
>

I like that idea, even though it would still not work while a node is
doing the crunching, that is, after it has pulled all of its input rows
and before it can return the first row. During this time the node won't
call ExecProcNode().

But that seems like an acceptable limitation. At least it keeps working
above "buffer" nodes.
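
To make the trade-off concrete, here is a minimal, self-contained C sketch.
It is not PostgreSQL code: the Node struct, the stop_requested flag and
exec_proc_node() are hypothetical stand-ins that only mimic the idea of
putting the early-termination check into the central dispatch function
instead of into each parallel-aware scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared "enough rows produced, stop" flag. */
static bool stop_requested = false;

typedef struct Node Node;
struct Node
{
    int   (*next) (Node *self);  /* returns a row (>= 0) or -1 at end of data */
    Node   *child;               /* input node, NULL for a scan */
    int     pos;                 /* rows returned so far */
    int     nrows;               /* scan: rows available; buffer: rows held */
    bool    loaded;              /* buffer: has all input been pulled? */
};

/*
 * Central dispatch, loosely modeled on the role of ExecProcNode().
 * The stop check lives here, so every node type gets it for free.
 */
static int
exec_proc_node(Node *node)
{
    assert(node != NULL);
    if (stop_requested)
        return -1;               /* pretend the input is exhausted */
    return node->next(node);
}

/* Toy scan: returns rows 0 .. nrows-1. */
static int
scan_next(Node *self)
{
    if (self->pos >= self->nrows)
        return -1;
    return self->pos++;
}

/* Toy "buffer" node (think Sort or HashAgg): pull everything, then emit. */
static int
buffer_next(Node *self)
{
    if (!self->loaded)
    {
        /* Pull phase: goes through the dispatcher, so it can be stopped. */
        while (exec_proc_node(self->child) >= 0)
            self->nrows++;

        /*
         * Crunch phase: a real sort or aggregation would run here without
         * calling exec_proc_node(), so a stop request raised during this
         * phase is only noticed once the phase is over.
         */
        self->loaded = true;
    }
    if (self->pos >= self->nrows)
        return -1;
    return self->pos++;
}

static Node
make_scan(int nrows)
{
    Node n = {.next = scan_next, .child = NULL, .pos = 0,
              .nrows = nrows, .loaded = true};
    return n;
}

static Node
make_buffer(Node *child)
{
    Node n = {.next = buffer_next, .child = child, .pos = 0,
              .nrows = 0, .loaded = false};
    return n;
}
```

With a 5-row scan below a buffer node, the first exec_proc_node() call on
the buffer drains the scan and returns a row. If stop_requested is set
afterwards, the next call on the buffer returns -1 right away, although
the scan underneath finished long ago. That is the case a check placed
only in the scan nodes would miss.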

I'll give this idea a try. Then we can contrast this approach with the
approach in my initial patch.

> It'd be nice to show this in EXPLAIN (that some of the workers were
> terminated early, before processing all the data).

Being able to inspect that seems useful. Maybe only with VERBOSE,
similar to the extended per-worker information.

--
David Geier
