Re: RFC: array_agg() per SQL:200n

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: array_agg() per SQL:200n
Date: 2008-03-25 02:58:15
Message-ID: 200803250258.m2P2wF909601@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Add to TODO:

* Add SQL-standard array_agg() and unnest() array functions

http://archives.postgresql.org/pgsql-hackers/2008-01/msg01017.php

---------------------------------------------------------------------------

Neil Conway wrote:
> I recently noticed that SQL:200n[1] defines a new aggregate function,
> array_agg(). The relevant portions of the spec are:
>
> p. 66: "If ARRAY_AGG is specified, then an array value with one element
> formed from the <value expression> evaluated for each row that
> qualifies."
>
> p. 556: <array aggregate function> ::=
> ARRAY_AGG <left paren>
> <value expression> [ ORDER BY <sort specification list> ]
> <right paren>
>
> p. 564 discusses the required behavior. The result of array_agg() is an
> array with one element per input value, sorted according to the optional
> ORDER BY clause. NULL input values are included in the array, and the
> result for an empty group is NULL, not an empty array. Note that per
> page 66, I'd expect array values in the input to array_agg() not to be
> flattened.
>
> I'd like to implement array_agg() for 8.4. In the past, we've talked
> about moving the array_accum() example aggregate into the backend[2].
> Now that there's SQL-standard syntax, that's another reason to do it --
> I think this is clearly useful functionality.
>
> The previous discussion got tied up in how to expose the aggregate's
> transition value to the type system. The problem is that the aggregate
> wants to use a transition value that is not a SQL-level type, to allow
> efficient array append operations. Various solutions were mooted about,
> typically involving a pass-by-val pseudotype used to hold a pointer to
> the C struct holding the transition state.
>
> AFAIK the conclusion reached by the previous thread was that to be type
> safe, you'd need one distinct pseudotype per aggregate function, along
> with some way to let the planner distinguish this class of pseudotypes
> from other types (in order to apply the heuristic that functions like
> these are likely to consume more memory). You could identify this class
> by an additional column in pg_type, but I think we'd need a lot of
> machinery to do this properly (e.g. to allow these types to be created
> via SQL). I wonder if this isn't over-engineering: the simple approach
> originally followed by Stephen Frost was to declare the transition value
> as, say, int8, and just disallow the transition and final functions from
> being called outside an aggregate context. AFAIK this would be safe,
> although of course it is ugly.
>
> To parse the ORDER BY clause, we'd need to special-case array_agg() in
> the grammar, which is a bit unfortunate. Implementation-wise, because
> there is no way to lazily evaluate an array expression, I don't think
> there's much to be gained by using the tuplesort infrastructure -- we'll
> need to materialize the entire array into memory when the final function
> is called anyway. Therefore, a simpler approach might be to just
> accumulate inputs in the transition function as usual, and then qsort()
> them in the final function. We could also have the planner arrange for
> the sort to be skipped if it knows that the input to the aggregate will
> be delivered in a compatible ordering.
>
> Comments welcome.
>
> -Neil
>
> [1] http://www.wiscorp.com/SQLStandards.html ; apparently SQL:200n is
> likely to become SQL:2008 without further changes.
>
> [2] http://archives.postgresql.org/pgsql-hackers/2006-10/msg00362.php
> http://archives.postgresql.org/pgsql-patches/2006-10/msg00059.php
> http://archives.postgresql.org/pgsql-hackers/2006-10/msg00683.php
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2008-03-25 03:02:43 Re: [GSoC] (Is it OK to choose items without % mark in theToDoList) && (is it an acceptable idea to build index on Flash Disk)
Previous Message Bruce Momjian 2008-03-25 02:48:02 Re: pl/pgsql Plan Invalidation and search_path