proposal: array utility functions phase 1

From: Joe Conway <mail(at)joeconway(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: proposal: array utility functions phase 1
Date: 2002-12-08 16:49:00
Message-ID: 3DF377FC.8050303@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

I'm working on the TODO item "Allow easy display of usernames in a group" in
the context of a slightly larger effort to improve usability of arrays. I'm
far enough down the road to have a better idea of where I want to go with
this, but I'd like to vet those ideas with the list so I don't waste too much
effort if everyone hates them ;-)

The first function borrows from an idea Nigel Andrews had -- i.e. expand an
array into rows (and possibly columns). It currently works like this:

-- 1D array
test=# select * from array_values('{101,102,103,104}'::int[]) as (a int, b1 int);
a | b1
---+-----
1 | 101
2 | 102
3 | 103
4 | 104
(4 rows)

CREATE TABLE arr_text(f1 int, f2 text[]);
INSERT INTO arr_text VALUES (1, '{"a","b","c"}');
UPDATE arr_text SET f2[-2:0] = '{"x","y","z"}' WHERE f1 = 1;
CREATE OR REPLACE FUNCTION get_arr_text(int) RETURNS text[] AS 'SELECT f2 FROM
arr_text WHERE f1 = $1' LANGUAGE 'sql';

test=# select * from array_values(get_arr_text(1)) as (a int, b1 text);
a | b1
----+----
-2 | x
-1 | y
0 | z
1 | a
2 | b
3 | c
(6 rows)

-- 2D array
test=# select * from array_values('{{1,2,3,4},{5,6,7,8}}'::int[]) as (a int,
b1 int, b2 int, b3 int, b4 int);
a | b1 | b2 | b3 | b4
---+----+----+----+----
1 | 1 | 2 | 3 | 4
2 | 5 | 6 | 7 | 8
(2 rows)

It accepts type anyarray, and returns record. The first column preserves the
array subscript for the 1st dimension.

One question I have is this: what, if anything, should be done with 3 (and
higher) dimension arrays? I was considering returning 2 columns -- the 1st
dimension array subscript, and a 2nd column containing the sub-array left
over. E.g.:

array_values('{{{111,112},{121,122}},{{211,212},{221,222}}}'::int[]) would become:

a | b1
----+-----------------------
1 | {{111,112},{121,122}}
2 | {{211,212},{221,222}}

Does this make sense, or is something else better, or would it be better not
to support 3 dim arrays and up?

Now on to the TODO item. Given the array_values() function, here's what I was
thinking of to implement listing members of a group:

CREATE OR REPLACE FUNCTION pg_get_grolist(text) RETURNS INT[] AS 'SELECT
grolist FROM pg_group WHERE groname = $1' LANGUAGE 'sql';

CREATE TYPE pg_grolist_rec AS (array_index int, member_name text);

CREATE OR REPLACE FUNCTION group_list(text) RETURNS SETOF pg_grolist_rec AS
'SELECT g.id, pg_get_userbyid(g.usesysid)::text AS member_name FROM
array_values(pg_get_grolist($1)) AS g(id int, usesysid int)' LANGUAGE 'sql';

test=# select * from pg_group;
groname | grosysid | grolist
---------+----------+---------------
g1 | 100 | {100,101}
g2 | 101 | {100,101,102}
(2 rows)

test=# select * from group_list('g2');
array_index | member_name
-------------+-------------
1 | user1
2 | user2
3 | user3

pg_get_grolist(text) is intended for internal use, as is the pg_grolist_rec
composite type. group_list() is intended as the user facing table function. I
would implement this by running the three statements above during initdb.

Any comments or objections WRT object names or the method of implementation? I
don't think this is a very speed critical application, but even using the sql
functions it is very fast:
test=# explain analyze select * from group_list('g2');
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Function Scan on group_list (cost=0.00..12.50 rows=1000 width=36) (actual
time=1.49..1.50 rows=3 loops=1)
Total runtime: 1.55 msec
(2 rows)

I have more planned beyond the above as outlined in an earlier post (see
http://archives.postgresql.org/pgsql-hackers/2002-11/msg01213.php).

Next on my list will be a split() function (as discussed in early September)
that creates an array from an input string by splitting on a given delimiter.
This is similar to functions in perl, php, and undoubtedly other languages. It
should work nicely in conjunction with array_values().

Sorry for the long mail and thanks for any feedback!

Joe

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeroen T. Vermeulen 2002-12-08 18:28:03 More on cursors in 7.3
Previous Message Magnus Naeslund(f) 2002-12-08 15:44:02 Re: [GENERAL] 7.3 txt2txtidx -> crash

Browse pgsql-patches by date

  From Date Subject
Next Message Neil Conway 2002-12-08 21:57:31 minor doc improvements
Previous Message Rod Taylor 2002-12-08 14:52:55 ALTER TABLE .. SET WITH / WITHOUT OIDS