Re: Difference between array column type and separate table

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
Cc: Mike Christensen <mike(at)kitchenpc(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Difference between array column type and separate table
Date: 2009-05-03 16:19:43
Message-ID: 49FDC41F.6030202@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Alban Hertroys wrote:
> On May 2, 2009, at 9:33 AM, Mike Christensen wrote:
>
>> ...
>> create table Threads ( ... Tags int2[], ...);
>>
>> To me this seems cleaner, but I'm wondering about performance. If I
>> had millions of threads, is a JOIN going to be faster? ...
>
> ...I don't think array values are indexable either. ...

Of course they are indexable if you construct the right index
http://www.postgresql.org/docs/current/static/intarray.html

For certain queries, the array will probably be *much* faster
than the extra tables. For example a query like this:

SELECT * FROM Threads WHERE Threads.tags @@ '1&(2|3)'::query_int;

that does a single indexscan will be far far faster using an
extra table, where the query would look something like:

select * from Threads T where
id in (select threadid from threadtags where id = 1)
and
id in (select threadid from threadtags where id = 2 or id = 3);

requiring 3 indexscans on a table with many more rows, and
some potentially rather expensive joins.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message justin 2009-05-03 16:43:58 Re: keeping track of function execution
Previous Message David Rowley 2009-05-03 16:17:16 Re: keeping track of function execution