Re: Even when the data is already ordered, MergeAppend still adds a Sort node

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: feichanghong <feichanghong(at)qq(dot)com>
Cc: Zhang Mingli <zmlpostgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Even when the data is already ordered, MergeAppend still adds a Sort node
Date: 2025-07-20 16:03:31
Message-ID: 252844.1753027411@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

feichanghong <feichanghong(at)qq(dot)com> writes:
> Currently, I have not found a better way to rewrite this, except by optimizing
> this scenario from the pg kernel side.

If you're willing to modify your query, you could fake it out by
spelling the subquery's "a = 1" condition in a way that won't produce an
EquivalenceClass. For example,

regression=# explain analyze select a, b from (
(select a, b from t t1 where a > 19000 order by a, b)
union all
(select a, b from t t2 where a >= 1 and a <= 1 and b > 1 order by a, b)
) t order by a, b limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.58..0.63 rows=1 width=8) (actual time=0.070..0.070 rows=1.00 loops=1)
Buffers: shared hit=8 read=1
-> Merge Append (cost=0.58..481.10 rows=11000 width=8) (actual time=0.069..0.069 rows=1.00 loops=1)
Sort Key: t1.a, t1.b
Buffers: shared hit=8 read=1
-> Index Only Scan using t_a_b_idx on t t1 (cost=0.29..29.79 rows=1000 width=8) (actual time=0.027..0.027 rows=1.00 loops=1)
Index Cond: (a > 19000)
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=2 read=1
-> Index Only Scan using t_a_b_idx on t t2 (cost=0.29..341.30 rows=10000 width=8) (actual time=0.041..0.041 rows=1.00 loops=1)
Index Cond: ((a >= 1) AND (a <= 1) AND (b > 1))
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=6
Planning:
Buffers: shared hit=6
Planning Time: 0.174 ms
Execution Time: 0.089 ms
(19 rows)

I'd be the first to agree that that's a hack not a nice solution.
But I think getting to something that's not a hack is going to
involve a lot more work than this edge case seems worth. We're
not likely to accept a patch that pessimizes planning within
subqueries on the small chance that that will result in a path
whose apparent sort order matches the needs of the outer query
better. Maybe something could be done inside
convert_subquery_pathkeys, but I suspect we don't really have
enough information at that point to decide what to do.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniil Davydov 2025-07-20 16:43:38 Re: POC: Parallel processing of indexes in autovacuum
Previous Message vignesh C 2025-07-20 14:18:42 Re: Logical Replication of sequences