Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Date: 2013-07-18 21:17:43
Message-ID: 51E85B77.6040300@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andrew,

> Well, as you probably know, the spec is a whole pile of random
> special-case syntax and any similarities are probably more accidental
> than anything else.

Hah, I didn't realize that our ordered aggregate syntax even *was* spec.

> A major difference is that in agg(x order by y), the values of y are
> not passed to the aggregate function - they serve no purpose other
> than controlling the order of the "x" values. Whereas in WITHIN GROUP,
> the values in the ORDER BY ... clause are in some sense the primary
> input to the aggregate, and the "p" argument is secondary and can't
> vary between rows of the group.
>
> Our implementation does heavily reuse the existing executor mechanics
> for ORDER BY in aggregates, and it also reuses a fair chunk of the
> parser code for it, but there are significant differences.

Well, seems like it would work the same as

agg_func(constx,coly,colz ORDER BY coly, colz)

... which means you could reuse a LOT of the internal plumbing. Or am I
missing something?

Also, what would a CREATE AGGREGATE and state function definition for
custom WITHIN GROUP aggregates look like?

> Any time you want to calculate what the rank, dense_rank or cume_dist
> would be of a specific row within a group without actually adding the
> row to the group, this is how it's done.
>
> I don't have any practical examples to hand, but this beast seems to
> be implemented in at least Oracle and MSSQL so I guess it has uses.

Well, I still can't imagine a practical use for it, at least based on
RANK. I certainly have no objections if you have the code, though.

I'll also point out that mode() requires ordered input as well, so add
that to the set of functions we'll want to eventually support.

One thing I find myself wanting with ordered aggregates is the ability
to exclude NULLs. Thoughts?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2013-07-18 21:24:02 pgsql: WITH CHECK OPTION support for auto-updatable VIEWs
Previous Message Andrew Gierth 2013-07-18 21:03:47 Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)