Re: Consider explicit incremental sort for Append and MergeAppend

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Consider explicit incremental sort for Append and MergeAppend
Date: 2025-05-15 13:03:36
Message-ID: 7f080758-8cc2-49a4-8968-5a4cde505e72@gmail.com
Lists: pgsql-hackers

On 12/5/2025 11:29, Richard Guo wrote:
> For ordered Append or MergeAppend, it seems that incremental sort is
> currently not considered when injecting an explicit sort into subpaths
> that are not sufficiently ordered. For instance:
Thanks for working on this.
I have reviewed your patch and want to share some thoughts:
0. The patch looks simple enough to be safe. I went through the code
and found no issues apart from the comments (see point 1). I am fine
with you committing it.
1. I'm not very happy that it tightens the coupling between cost_append
and create_append_plan. At the very least, there should be
cross-reference comments so that developers who change something
in one of these functions know to check the other.
2. IncrementalSort is not always more efficient: it depends on the
number of groups in the column. In my experience, a non-cost-based
decision sooner or later hits a problematic case, and the people who are
stuck with it are much more confused than when the planner's decision is
tied to costing, because they trust the cost model, or the cost model as
tuned by GUCs.
3. The functions label_incrementalsort_with_costsize and
label_sort_with_costsize are not ideal architectural decisions.
Whenever I try to improve the sort / incremental sort cost functions, I
get stuck on the absence of some necessary data from the sorting path
and RelOptInfo at this stage.
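
To illustrate point 2, here is a minimal sketch of a toy cost comparison.
It is not PostgreSQL's cost model: the function names, the N*log2(N)
formula, and the group_overhead constant are all assumptions made for
illustration only. It shows only that whether incremental sort wins
depends on the number of groups, which is why a cost-based choice is
safer than an unconditional one.

```python
import math


def full_sort_cost(ntuples: float, cmp_cost: float = 1.0) -> float:
    """Toy cost of sorting all tuples at once: cmp_cost * N * log2(N)."""
    if ntuples < 2:
        return 0.0
    return cmp_cost * ntuples * math.log2(ntuples)


def incremental_sort_cost(ntuples: float, ngroups: float,
                          cmp_cost: float = 1.0,
                          group_overhead: float = 2.0) -> float:
    """Toy cost of sorting each group separately.

    Assumes tuples are spread evenly over ngroups groups. Adds one
    comparison per tuple to detect group boundaries and a fixed
    (hypothetical) overhead per group.
    """
    ngroups = max(1.0, min(ngroups, ntuples))
    per_group = ntuples / ngroups
    sort_part = ngroups * full_sort_cost(per_group, cmp_cost)
    boundary_part = cmp_cost * ntuples
    return sort_part + boundary_part + group_overhead * ngroups


def choose_sort(ntuples: float, ngroups: float) -> str:
    """Pick the cheaper strategy under the toy model above."""
    if incremental_sort_cost(ntuples, ngroups) < full_sort_cost(ntuples):
        return "incremental"
    return "full"
```

Under these assumptions, a single group makes incremental sort strictly
more expensive than a full sort (same sorting work plus the extra
overhead), while many groups make it cheaper.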

As an alternative, you may check the approach of [1], where we decide
how to adjust a subpath to MergeAppend's needs inside
generate_orderedappend_paths, using a cost-based approach.

Also, would you have a chance to look into [1,2]? They seem like a
further improvement, bringing the optimality of the path choice under an
append a bit closer to that of a single-table scan.

[1]
https://www.postgresql.org/message-id/flat/25d6a2cd161673d51373b7e07e6d9dd6%40postgrespro.ru
[2]
https://www.postgresql.org/message-id/f0206ef2-6b5a-4d07-8770-cfa7cd30f685@gmail.com

--
regards, Andrei Lepikhov
