Re: ordered aggregates using WITHIN GROUP (was Re: can somebody execute this query on Oracle 11.2g and send result?)

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ordered aggregates using WITHIN GROUP (was Re: can somebody execute this query on Oracle 11.2g and send result?)
Date: 2010-01-31 04:47:46
Message-ID: e08cc0401001302047i4767abc2h25db1811685dc631@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2010/1/30 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> "Jonah H. Harris" <jonah(dot)harris(at)gmail(dot)com> writes:
>>> http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/functions087.htm
>>>
>>> Defines:
>>>
>>> *LISTAGG* (measure_expr [, 'delimiter_expr'])
>>> *WITHIN GROUP* (order_by_clause) [*OVER* query_partition_clause]
>
> Hmph.  I don't know what would possess them to model their function on
> the rank-function syntax extension rather than ARRAY_AGG.  The latter
> seems a lot closer to the functionality that's actually needed.  I'm
> still trying to wrap my brain around what the spec says about the
> rank-function syntax, but it's notable that the order-by clause is
> tightly tied to the aggregate input value(s) --- the sort expressions
> have to have the same number and types as the inputs.  Which is
> certainly not very sensible for listagg.
>
> Can anyone figure out exactly what SQL:2008 10.9 rule 6 is actually saying?
> The references to VE1..VEk in the scalar subquery seem to me to be
> semantically invalid.  They would be sensible if this were a window
> function, but it's an aggregate, so I don't understand what row they'd
> be evaluated with respect to.

As far as I know <hypothetical set function> is used to do "what-if"
analysis. rank(val1) within group (order by sk1) chooses the rank
value so that val1 is equivalent to or just greater than sk1 when you
calculate rank() over (partition by group order by sk1) within the
group. So this is actually an aggregate and in 10.9 rule 6 it extracts
only one row from all results of rank() (WHERE MARKER = 1) which is
calculated with all rows within the group + argument value list.
Again, the argument of this kind of functions should be constant
during aggregate (at least it looks like so to me).

SELECT salary FROM emp;
salary
--------
300
500
700

SELECT rank(530) WITHIN GROUP(ORDER BY salary),
rank(200) WITHIN GROUP(ORDER BY salary) FROM emp;
rank | rank
------+------
3 | 1

Googling web, there's been the syntax in Oracle for some time. So I'd
bet Oracle crews hated to invent new syntax for listagg() because
ordered aggregate can be represented by *existing* WITHIN GROUP syntax
although the spec distinguish them. I don't think we should change
ordered aggregate syntax we have just introduced, but one of choices
is to support both of them. In other words, the queries can be the
same:

SELECT array_agg(val ORDER BY sk) FROM ...
SELECT array_agg(val) WITHIN GROUP (ORDER BY sk) FROM ...

P.S. I don't have Oracle to try with so I misunderstood something.

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-01-31 05:27:27 Re: further explain changes
Previous Message Euler Taveira de Oliveira 2010-01-31 04:44:13 Re: development setup and libdir