MULTISET and additional functions for ARRAY

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: MULTISET and additional functions for ARRAY
Date: 2010-11-11 15:02:52
Message-ID: AANLkTi=yjjEwnUNhNxd-JAL7nooHKexmPedv5tB2H5xp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Postgres supports ARRAY data types well, but there are some
more array functions in the SQL standard. Also, the standard
has MULTISET data type, that is an unordered array.

It looks easy to support additional array functions. There
might be some confusion to treat multi-dimensional arrays
with them, but we could treat all arrays as one-dimensional
like as unnest().

MULTISET supports are more difficult. We have corresponding
type IDs for each array, but we might not want to add additional
IDs for multiset for each type. Any ideas for the issue?

If we reuse type IDs of arrays for multisets, the multisets would
have some special typmod. For example, typmod = 0 means multiset,
and positive value means array with max cardinality. Note that
the SQL standard doesn't mention about multi-dimensional arrays.
So, we can use typmod = -1 as a free-size and free-dimensional
array for backward compatibility.

If we have troublesome issues to support multiset data types,
I'm thinking to add multiset functions that receives ARRAY
types instead at time first time, because an ARRAY is a
MULTISET by definition. Some of functions for multisets
seems to be useful for arrays, too.

Comments and suggestions welcome.

=== Array functions ===
- [FUNCTION] cardinality(anyarray) => integer
- [FUNCTION] trim_array(anyarray, nTrimmed integer) => anyarray

=== Multiset functions ===
- [FUNCTION] cardinality(anymultiset) => integer
- [FUNCTION] element(anymultiset) => anyelement
- [FUNCTION] multiset_member_of(anymultiset, anyelement) => boolean
[SYNTAX] $2 MEMBER OF $1
- [FUNCTION] multiset_is_a_set(anymultiset) => boolean
[SYNTAX] $1 IS A SET
- [FUNCTION] multiset_sub_multiset_of(anymultiset, anymultiset) => boolean
[SYNTAX] $2 SUB MULTISET OF $1
- [FUNCTION] multiset_union[_all](anymultiset, anymultiset) => anymultiset
[SYNTAX] $1 MULTISET UNION [ALL | DISTINCT] $2
- [FUNCTION] multiset_intersect[_all](anymultiset, anymultiset) => anymultiset
[SYNTAX] $1 MULTISET INTERSECT [ALL | DISTINCT] $2
- [FUNCTION] multiset_except[_all](anymultiset, anymultiset) => anymultiset
[SYNTAX] $1 MULTISET EXCEPT [ALL | DISTINCT] $2
- [AGGREGATE] collect(anyelement) => anymultiset
- [AGGREGATE] fusion(anymultiset) => anymultiset
- [AGGREGATE] intersection(anymultiset) => anymultiset

See also secondary sources.
http://waelchatila.com/2005/05/18/1116485743467.html
http://farrago.sourceforge.net/design/CollectionTypes.html
http://publib.boulder.ibm.com/infocenter/db2luw/v9r8/index.jsp?topic=/com.ibm.db2.luw.apdv.sqlpl.doc/doc/t0053486.html
http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/conditions006.htm
http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/operators006.htm

--
Itagaki Takahiro

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yeb Havinga 2010-11-11 15:10:12 Re: BUG #5748: Invalid oidvector data during binary recv
Previous Message Tom Lane 2010-11-11 14:45:30 Re: Exposing an installation's default value of unix_socket_directory