Re: [GENERAL] arrays

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: Mike Sosteric <mikes(at)athabascau(dot)ca>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: [GENERAL] arrays
Date: 2002-09-30 15:54:31
Message-ID: web-1734564@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers pgsql-sql

Mike,

> We are currently developing a database to host some complicated, XMl
> layered data. We have chosen postgres because of its ability to store
> multidimensional arrays. We feel that using these will allow us to
> simplify the database structure considerably by storing some data in
> multidimensional arrays.

Hmmm ... I'm curious; what kind of data do you feel could be
*simplified* by multi-dimensional arrays?

> However, we currently have some dissenters who believe that using the
> multidimensional arrays will make queries slower and unneccesarily
> complicated.

They're correct, especially about the latter.

> 1) are SQL queries slower when extracting data from multidimensional
> arrays

Yes, but this is fixable; see the Intarray package in /contrib.

> 2) are table joins more difficult or unneccesarily complicated

Yes.

> 3) can you do selects on only a portion of a multidimensional array.

Yes.

> That
> is, if you were storing multilanguage titles in a two dimensional
> array,
>
> [en], "english title"
> [fr], "french title"
>
> could you select where title[0] = 'en'

Yes.

> I know these may sound like terribily stupid questions. but we need
> some
> quick guidance before proceeding with a schema that relies on these
> advanced data features of postgres

The problem you will be facing is that Arrays are one of the
fundamentally *Non-Relational* features that Postgresql supports for a
limited set of specialized purposes (mostly buffer tables, procedures,
and porting from MySQL). As such, incorporating arrays into any kind
of complex schema will drive you to drink ... and is 95% likely more
easily done through tables and sub-tables, in any case.

Let's take your example of "title", and say we wanted to use it in a
join:

SELECT movie.name, movie.show_date, movie.title_lang, title.translation
FROM movies JOIN title_langs ON (
movie.title_lang[1] = title_langs.lang OR movie.title_lang[2] =
title_langs.lang OR movie.title_lang[3] = title_langs.lang ... )

... as you can see, the join is extremely painful. Let alone
constructing a query like "Select all movies with titles only in
English and French and one other language." (try it, really)

Then there's the not insignificant annoyance of getting data into and
out of multi-dimensional arrays, which must constantly be parsed into
text strings. And the fact that you will have to keep track, in your
middleware code, of what the ordinal numbers of arrays mean, since
array elements are fundamentally ordered. (BTW, Postgres arrays begin
at 1, not 0)

Now, I know at least one person who is using arrays to store scientific
data. However, that data arrives in his lab in the form of matrices,
and is not used for joins or query criteria beyond a simple "where"
clause.

As such, I'd reccommend one of two approaches for you:

1) Post some of your schema ideas here, and let us show you how they
are better done relationally. The relational data model has 30 years
of thought behind it -- it can solve a lot of problems.

2) Shift over to an XML database or a full-blown OODB (like Cache').

Good luck.

-Josh Berkus

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dan Langille 2002-09-30 15:59:12 Re: [GENERAL] arrays
Previous Message Andy Kriger 2002-09-30 15:51:14 how do i find out how long a query took?

Browse pgsql-hackers by date

  From Date Subject
Next Message Dan Langille 2002-09-30 15:59:12 Re: [GENERAL] arrays
Previous Message Justin Clift 2002-09-30 15:40:49 Re: 7.2.3 fixes (was Re: Cause of missing pg_clog files)

Browse pgsql-sql by date

  From Date Subject
Next Message Dan Langille 2002-09-30 15:59:12 Re: [GENERAL] arrays
Previous Message Tom Lane 2002-09-30 14:42:08 Re: [SQL] arrays