Re: Postgresql - performance of using array in big database

From: Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com>
To: roberthanco(at)o2(dot)pl
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgresql - performance of using array in big database
Date: 2012-08-09 00:18:10
Message-ID: CAM6mieJN_3O9HSJ7uO3gjCQ2LTNwp734hhWgSf3bWEcShfh_xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

On 3 August 2012 19:14, <roberthanco(at)o2(dot)pl> wrote:
> I want to add to table "Item" a column "a_elements" (array type of big
> integers) Every record would have not more than 50-60 elements in this
> column.
> After that i would create index GIN on this column and typical select should
> look like this:
> select*from item where......and5<@ a_elements;

I would use this.

> I have also second, more classical, option.
> Do not add column a_elements to table item but create table elements with
> two columns:
>
> id_item
> id_element
>
> This table would have around 200 mln records.
> I am able to do partitioning on this tables so number of records would
> reduce to 20 mln in table elements and 500 K in table item.

I do not understand how you can 'reduce to 20 mln'. Do you mean per partition?

> The second option select looks like this:
> select item.*
> from item
> leftjoin elements on(item.id_item=elements.id_item)
> where....
> and5= elements.id_element
> I wonder what option would be better in performance point of view. Is
> postgres able to use many different indexes with index GIN (option 1) in a
> single query ?

Assuming that you partition your tables using id_item. Postgres is not
good with partitions if joins are used. Let's have a query:
select .. from item
left join elements on (item.id_item=elements.id_item)
where id_item = 2

needs to scan all partitions in 'elements' table because planner is
not smart enough to push where condition to join clause i.e. rewrite
query like this (8.4, haven't checked in 9.x releases):

select .. from item
left join elements on (item.id_item=elements.id_item and elements.id_item = 2)
where id_item = 2

In order to use partitioning effectively all you queries need to have
constant expression (id_item = 2) in where/join on columns which are
used for partitioning

--
Ondrej Ivanic
(ondrej(dot)ivanic(at)gmail(dot)com)

Browse pgsql-performance by date

  From Date Subject
Next Message Stefan Keller 2012-08-09 11:00:18 Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Previous Message Craig Ringer 2012-08-08 06:50:26 Re: Postgres Upgrade from 8.4 to 9.1