Re: Add proper planner support for ORDER BY / DISTINCT aggregates

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date: 2021-07-21 02:52:43
Message-ID: CAApHDvogabtJzP8K6s75ruSXJAOfCvUJtUVWBS=2C-49kRqvCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 19 Jul 2021 at 18:32, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> regression=# explain select sum(unique1 order by ten, two), sum(unique1 order
> by four), sum(unique1 order by two, four) from tenk2 group by ten;
> QUERY PLAN
> ------------------------------------------------------------------------
> GroupAggregate (cost=1109.39..1234.49 rows=10 width=28)
> Group Key: ten
> -> Sort (cost=1109.39..1134.39 rows=10000 width=16)
> Sort Key: ten, ten, two
> -> Seq Scan on tenk2 (cost=0.00..445.00 rows=10000 width=16)
>
>
> We would ideally like to sort on ten, two, four to satisfy the first and last
> aggref at once. Stripping the common prefix (ten) would eliminate this problem.

Thanks for finding this. I've made a few changes to make this case
work as you describe. Please see attached v6 patches.

Because I had to add additional code to take the GROUP BY pathkeys
into account when choosing the best ORDER BY agg pathkeys, the
function that does that became a little bigger. To try to reduce the
complexity of it, I got rid of all the processing from the initial
loop and instead it now uses the logic from the 2nd loop to find the
best pathkeys. The reason I'd not done that in the first place was
because I'd thought I could get away without building an additional
Bitmapset for simple cases, but that's probably fairly cheap compared
to building Pathkeys. With the additional complexity for the GROUP
BY pathkeys, the extra code seemed not worth it.

The 0001 patch is the ORDER BY aggregate code. 0002 is to fix up some
broken regression tests in postgres_fdw that 0001 caused. It appears
that 0001 uncovered a bug in the postgres_fdw code. I've reported
that in [1]. If that turns out to be a bug then it'll need to be fixed
before this can progress.

David

[1] https://www.postgresql.org/message-id/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com

Attachment Content-Type Size
v6-0001-Add-planner-support-for-ORDER-BY-aggregates.patch application/octet-stream 28.5 KB
v6-0002-Make-the-postgres_fdw-test-pass.patch application/octet-stream 5.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-07-21 04:21:17 Re: Addition of authenticated ID to pg_stat_activity
Previous Message Bruce Momjian 2021-07-21 02:48:34 Re: Have I found an interval arithmetic bug?