Quick Links

Re: Even when the data is already ordered, MergeAppend still adds a Sort node

From:	feichanghong <feichanghong(at)qq(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Zhang Mingli <zmlpostgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Even when the data is already ordered, MergeAppend still adds a Sort node
Date:	2025-07-21 01:58:09
Message-ID:	tencent_84733F884CFE3830104A94BCDF1132C83A09@qq.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> On Jul 21, 2025, at 00:03, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> feichanghong <feichanghong(at)qq(dot)com> writes:
>> Currently, I have not found a better way to rewrite this, except by optimizing
>> this scenario from the pg kernel side.
>
> If you're willing to modify your query, you could fake it out by
> spelling the subquery's "a = 1" condition in a way that won't produce an
> EquivalenceClass. For example,
>
> regression=# explain analyze select a, b from (
> (select a, b from t t1 where a > 19000 order by a, b)
> union all
> (select a, b from t t2 where a >= 1 and a <= 1 and b > 1 order by a, b)
> ) t order by a, b limit 1;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.58..0.63 rows=1 width=8) (actual time=0.070..0.070 rows=1.00 loops=1)
> Buffers: shared hit=8 read=1
> -> Merge Append (cost=0.58..481.10 rows=11000 width=8) (actual time=0.069..0.069 rows=1.00 loops=1)
> Sort Key: t1.a, t1.b
> Buffers: shared hit=8 read=1
> -> Index Only Scan using t_a_b_idx on t t1 (cost=0.29..29.79 rows=1000 width=8) (actual time=0.027..0.027 rows=1.00 loops=1)
> Index Cond: (a > 19000)
> Heap Fetches: 0
> Index Searches: 1
> Buffers: shared hit=2 read=1
> -> Index Only Scan using t_a_b_idx on t t2 (cost=0.29..341.30 rows=10000 width=8) (actual time=0.041..0.041 rows=1.00 loops=1)
> Index Cond: ((a >= 1) AND (a <= 1) AND (b > 1))
> Heap Fetches: 0
> Index Searches: 1
> Buffers: shared hit=6
> Planning:
> Buffers: shared hit=6
> Planning Time: 0.174 ms
> Execution Time: 0.089 ms
> (19 rows)

Thank you very much for your suggestion, this method is effective for my
scenario. I will modify my SQL accordingly.

> I'd be the first to agree that that's a hack not a nice solution.
> But I think getting to something that's not a hack is going to
> involve a lot more work than this edge case seems worth. We're
> not likely to accept a patch that pessimizes planning within
> subqueries on the small chance that that will result in a path
> whose apparent sort order matches the needs of the outer query
> better. Maybe something could be done inside
> convert_subquery_pathkeys, but I suspect we don't really have
> enough information at that point to decide what to do.

Yes, if we retain the EC_MUST_BE_REDUNDANT pathkey in the subquery, it may lead
to a less optimal plan. For example, consider the following case: the subquery
is sorted by "a,b", and the top-level query is sorted by "b". In this
situation, the subquery's pathkey "b" matches the top-level query. However, if
we retain the subquery's pathkey as "a,b", it does not match the top-level
query, resulting in the need for an additional sort operation.
```sql
-- subquery pathkey is "b"
postgres=# explain select * from (select * from t where a = 1 and b > 1 order by a, b) t order by b limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------
Limit (cost=0.29..0.33 rows=1 width=8)
-> Index Only Scan using t_a_b_idx on t (cost=0.29..316.27 rows=9999 width=8)
Index Cond: ((a = 1) AND (b > 1))
(3 rows)

-- subquery pathkey is "a,b"
postgres=# explain select * from (select * from t where a = 1 and b > 1 order by a, b) t order by b limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Limit (cost=1155.56..1155.56 rows=1 width=8)
-> Sort (cost=1155.56..1180.56 rows=9999 width=8)
Sort Key: t.b
-> Sort (cost=980.58..1005.58 rows=9999 width=8)
Sort Key: t.a, t.b
-> Index Only Scan using t_a_b_idx on t (cost=0.29..316.27 rows=9999 width=8)
Index Cond: ((a = 1) AND (b > 1))
(7 rows)
```

A possible solution might be: we retain the EC_MUST_BE_REDUNDANT pathkey within
the subquery, but when executing create_sort_plan, pathkeys_contained_in, and
convert_subquery_pathkeys, we need to take into account pathkeys with
ec_has_const. Naturally, the actual situation could be much more complex.

Best Regards,
Fei Changhong

In response to

Re: Even when the data is already ordered, MergeAppend still adds a Sort node at 2025-07-20 16:03:31 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	jian he	2025-07-21 03:15:36	Re: array_random
Previous Message	Fujii Masao	2025-07-21 01:14:43	Re: Log prefix missing for subscriber log messages received from publisher