From: | Dennis Runz <d(dot)runz(at)stud(dot)uni-heidelberg(dot)de> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | 2D array aggregation performance (array_agg for arrays) |
Date: | 2012-01-16 17:31:34 |
Message-ID: | CALB1XpLXq9tKvGPS158KQ7c6rXwjgr8_56G8hCCbN3kU8h=xXA@mail.gmail.com |
Lists: | pgsql-general |
Hello Community,
I am working on a database extension for PostgreSQL (8.4+) to support
functions for spectral graph theory of spatial/geometric graphs such as
proteins. For this purpose we need to store and use huge multidimensional
arrays in the database (the adjacency matrix of a graph).
The performance-critical function here is the aggregation of
one-dimensional arrays into a two-dimensional array,
e.g. {1,2} and {3,4} => {{1,2},{3,4}}, or more generally a set of arrays
into an array of arrays.
The array_agg function performs well, but it only supports aggregating
non-array element types into arrays. For performance reasons, we need a
similar function that is able to aggregate arrays as shown above. Other
functions like array_cat reallocate the array after each aggregation step,
which doesn't scale.
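To make the scaling concern concrete, here is a minimal sketch in plain C (no PostgreSQL headers) of the accumulation pattern I have in mind: rows are copied into one flat row-major buffer that doubles in capacity when full, so the buffer is not reallocated on every step. The names RowAccum and row_accum_* are hypothetical and exist only for this illustration; malloc/realloc stand in for whatever allocation a real aggregate would use, and int stands in for the element type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator: equal-length rows stored in one flat buffer. */
typedef struct RowAccum {
    int    *data;     /* row-major storage; nrows * ncols values in use */
    size_t  nrows;    /* rows appended so far */
    size_t  ncols;    /* fixed length of every row */
    size_t  cap_rows; /* rows the buffer can hold before it must grow */
} RowAccum;

/* Start with room for a few rows; returns 0 on success, -1 on failure. */
int row_accum_init(RowAccum *acc, size_t ncols)
{
    assert(ncols > 0);
    acc->nrows = 0;
    acc->ncols = ncols;
    acc->cap_rows = 8;
    acc->data = malloc(acc->cap_rows * ncols * sizeof(int));
    return acc->data ? 0 : -1;
}

/* Append one row. Capacity doubles when full, so the number of
 * reallocations grows logarithmically with the number of rows. */
int row_accum_append(RowAccum *acc, const int *row, size_t len)
{
    if (len != acc->ncols)
        return -1; /* ragged input cannot form a rectangular array */

    if (acc->nrows == acc->cap_rows) {
        size_t new_cap = acc->cap_rows * 2;
        int *grown = realloc(acc->data, new_cap * acc->ncols * sizeof(int));
        if (grown == NULL)
            return -1;
        acc->data = grown;
        acc->cap_rows = new_cap;
    }

    memcpy(acc->data + acc->nrows * acc->ncols, row, len * sizeof(int));
    acc->nrows++;
    return 0;
}

/* Report the final 2-D shape; dims[0] * dims[1] is the stored value count. */
void row_accum_dims(const RowAccum *acc, size_t dims[2])
{
    dims[0] = acc->nrows;
    dims[1] = acc->ncols;
}

/* Release the buffer. */
void row_accum_free(RowAccum *acc)
{
    free(acc->data);
    acc->data = NULL;
}
```

Appending {1,2} and then {3,4} leaves the values 1,2,3,4 in the buffer with dims {2,2}, which is the flat layout a two-dimensional array would be built from. Rows of a different length are rejected here; how a real aggregate should treat them (or NULL inputs) is left open in this sketch.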
Now I am trying to implement array_agg for array of array aggregation using
array_agg_transfn (-> hd_array_transfn) and array_agg_finalfn (->
hd_array_finalfn) from Postgres 9.1 sources as a starting point.
This is what the current code looks like:
https://gist.github.com/5b2b60a939bec8410382
I assume it is not sufficient to simply adapt the final function to create a
2D array? I tried this, but Postgres crashes with the following backtrace:
(gdb) bt
#0 pg_detoast_datum (datum=0x0) at fmgr.c:2233
#1 0x00ab9303 in construct_md_array (elems=0x220ffbb0, nulls=0x220ffcb8
"", ndims=2, dims=0xbf84c694, lbs=0xbf84c69c, elmtype=1007, elmlen=-1,
elmbyval=0 '\000', elmalign=105 'i') at arrayfuncs.c:2936
#2 0x00ac0052 in makeMdArrayResult (astate=0x220ffb88, ndims=2,
dims=0xbf84c694, lbs=0xbf84c69c, rcontext=0x220d8aa8, release=0 '\000') at
arrayfuncs.c:4665
#3 0x0056c9d1 in hd_array_finalfn () from
/usr/lib/postgresql/9.1/lib/hd_array.so
#4 0x009c4ffa in finalize_aggregate (aggstate=<optimized out>,
peraggstate=0x220f9d58, pergroupstate=0x220f9e60, resultVal=0x220f9d38,
resultIsNull=0x220f9d48 "") at nodeAgg.c:758
# ...
I am new to Postgres internals and Postgres programming and would
greatly appreciate it if anyone could help me with this implementation
problem. We are using PostgreSQL 9.1, but the aggregate should eventually
also run on 8.4.
Best Regards,
Dennis