Re: array_accum aggregate

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: array_accum aggregate
Date: 2006-10-09 16:55:47
Message-ID: 20061009165547.GS24675@kenobi.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > I was hoping to do that, but since it's an aggregate the ffunc format is
> > pre-defined to require accepting the 'internal state' and nothing else,
> > and to return 'anyelement' or 'anyarray' one of the inputs must be an
> > 'anyelement' or 'anyarray', aiui.
>
> Hmm ... I hadn't been thinking about what the state type would need to
> be, but certainly "bytea" is a lie given what you're really doing.

Indeed. I've updated the functions quite a bit to clean things up,
including: Added many more comments, removed the unnecessary 'storage*'
pointer being used, created my own structure for tracking state
information, created a seperate memory context (tied to the AggContext),
correctly handle NULL values, and changed the ffunc to use
makeArrayResult.

I also tried just tried using polymorphic types for the functions and
for the aggregate and it appeared to just work:

create function aaccum_sfunc (anyarray, anyelement) returns anyarray
language 'C' AS 'aaccum.so', 'aaccum_sfunc'
;
create function aaccum_ffunc (anyarray) returns anyarray language
'C' AS '/data/sfrost/postgres/arrays/aaccum.so', 'aaccum_ffunc'
;
create aggregate aaccum (
sfunc = aaccum_sfunc,
basetype = anyelement,
stype = anyarray,
finalfunc = aaccum_ffunc
);

select aaccum(generate_series) from generate_series(1,5);
aaccum
-------------
{1,2,3,4,5}
(1 row)

(test is a table with one varchar column, abc)
select aaccum(abc) from test;
aaccum
---------
{a,b,c}
(1 row)

(Added a column called 'hi', set to 'a', added b,b and c,b)
select hi,aaccum(abc) from test group by hi;
hi | aaccum
----+---------
b | {b,c}
a | {a,b,c}
(2 rows)

It makes some sense that it would work as an 'anyarray' is just a
variable-length type internally and so long as nothing else attempts to
make sense out of our 'fake array' everything should be fine.

The latest version also appears to be a bit faster than the prior
version. I'm going to be running a very large query shortly using
this aaccum and will report back how it goes. Please let me know if
there are any other improvments or changes I should make. I'd like to
submit this to -patches w/ the appropriate entries to have it be
included in the core distribution. Is it acceptable to reuse the
'array_accum' name even though it was used in the documentation as an
example? I'm thinking yes, but wanted to check.

Thanks!

Stephen

Attachment Content-Type Size
aaccum.c text/x-csrc 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim C. Nasby 2006-10-09 17:07:29 Re: pg_dump exclusion switches and functions/types
Previous Message Jim C. Nasby 2006-10-09 16:54:00 Re: pg_dump exclusion switches and functions/types