From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Document aggregate functions better w.r.t. ORDER BY |
Date: | 2023-10-26 01:57:10 |
Message-ID: | ZTnHdk833s1UG-Vi@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 25, 2023 at 05:10:17PM -0700, David G. Johnston wrote:
> The paragraph leading into the last added example needs to be tweaked:
>
> If DISTINCT is specified within an aggregate, the data is sorted in ascending
> order while extracting unique values. You can add an ORDER BY clause, limited
> to expressions matching the regular arguments of the aggregate, to sort the
> output in descending order.
>
> (show existing - DISTINCT only - example here)
>
> <programlisting>
> WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
> SELECT string_agg(DISTINCT v::text, ';' ORDER BY v::text DESC) FROM vals;
> string_agg
> -----------
> 4;3;2;1
> </programlisting>
>
> (existing note)
I see what you mean. I added an example that doesn't match the existing
paragraph. I have rewritten the paragraph and used a relevant example;
patch attached.
> Question: Do you know whether we for certain always sort ascending here to
> compute the unique values or whether if, say, there is an index on the column
> in descending order (or ascending and traversed backwards) that the data within
> the aggregate could, with an order by, be returned in descending order? If it
> is ascending, is that part of the SQL Standard (since it doesn't even allow an
> order by to give the user the ability the control the output ordering) or does
> the SQL Standard expect that even a random order would be fine since there are
> algorithms that can be used that do not involve sorting the input?
I don't think order is ever guaranteed in the standard without an ORDER
BY.
> It seems redundant to first say "regular arguments" then negate it in order to
> say "DISTINCT list". Using the positive form with "DISTINCT list" should get
> the point across sufficiently and succinctly. It also avoids me feeling like
> there should be an example of what happens when you do "sort on an expression
> that is not included in the DISTINCT list".
Agreed, I rewrote that.
> Interestingly:
>
> WITH vals (v,l) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
> SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM vals;
>
> ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in
> argument list
> LINE 2: SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM...
>
> But both expressions in the argument list (el and semicolon) do appear in the
> ORDER BY...
I think ORDER BY has to match DISTINCT columns, while you are using ';'.
I used a simpler example with array_agg() in my patch to avoid the issue.
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
Attachment | Content-Type | Size |
---|---|---|
agg_order.diff | text/x-diff | 5.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2023-10-26 01:59:58 | Re: Introduce a new view for checkpointer related stats |
Previous Message | Daniele Varrazzo | 2023-10-26 01:31:33 | Re: libpq async connection and multiple hosts |