Re: Document aggregate functions better w.r.t. ORDER BY

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Document aggregate functions better w.r.t. ORDER BY
Date: 2023-10-26 00:10:17
Message-ID: CAKFQuwZNrihaCkG_D_7s4hMeAVs4nr9QyTreizF5Dx4NWD=ybw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 25, 2023 at 4:22 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Wed, Oct 25, 2023 at 04:14:11PM -0700, David G. Johnston wrote:
> > Yeah, we punt on the entire concept in the data type section:
> >
> > "Managing these errors and how they propagate through calculations is the
> > subject of an entire branch of mathematics and computer science and will
> not be
> > discussed here," ...
> >
> > Also, I'm now led to believe that the relevant IEEE 754 floating point
> addition
> > is indeed commutative. Given that, I am inclined to simply not add the
> order
> > by clause at all to those four functions. (actually, you already got rid
> of the
> > avg()s but the sum()s are still present, so just those two).
>
> Ah, yes, sum() removed. Updated patch attached.
>
>
The paragraph leading into the last added example needs to be tweaked:

If DISTINCT is specified within an aggregate, the data is sorted in
ascending order while extracting unique values. You can add an ORDER BY
clause, limited to expressions matching the regular arguments of the
aggregate, to sort the output in descending order.

(show existing - DISTINCT only - example here)

<programlisting>
WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
SELECT string_agg(DISTINCT v::text, ';' ORDER BY v::text DESC) FROM vals;
string_agg
-----------
4;3;2;1
</programlisting>

(existing note)

Question: Do you know whether we for certain always sort ascending here to
compute the unique values or whether if, say, there is an index on the
column in descending order (or ascending and traversed backwards) that the
data within the aggregate could, with an order by, be returned in
descending order? If it is ascending, is that part of the SQL Standard
(since it doesn't even allow an order by to give the user the ability the
control the output ordering) or does the SQL Standard expect that even a
random order would be fine since there are algorithms that can be used that
do not involve sorting the input?

It seems redundant to first say "regular arguments" then negate it in order
to say "DISTINCT list". Using the positive form with "DISTINCT list"
should get the point across sufficiently and succinctly. It also avoids me
feeling like there should be an example of what happens when you do "sort
on an expression that is not included in the DISTINCT list".

Interestingly:

WITH vals (v,l) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM vals;

ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in
argument list
LINE 2: SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM...

But both expressions in the argument list (el and semicolon) do appear in
the ORDER BY...

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-10-26 00:40:55 Re: Should we represent temp files as unsigned long int instead of signed long int type?
Previous Message Michael Paquier 2023-10-26 00:08:42 Re: pg_stat_statements and "IN" conditions