Skip site navigation (1) Skip section navigation (2)

Re: Postgresql - performance of using array in big database

From: Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com>
To: roberthanco(at)o2(dot)pl
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgresql - performance of using array in big database
Date: 2012-08-09 00:18:10
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance

On 3 August 2012 19:14,  <roberthanco(at)o2(dot)pl> wrote:
> I want to add to table "Item" a column "a_elements" (array type of big
> integers) Every record would have not more than 50-60 elements in this
> column.
> After that i would create index GIN on this column and typical select should
> look like this:
> select*from item where......and5<@ a_elements;

I would use this.

> I have also second, more classical, option.
> Do not add column a_elements to table item but create table elements with
> two columns:
> id_item
> id_element
> This table would have around 200 mln records.
> I am able to do partitioning on this tables so number of records would
> reduce to 20 mln in table elements and 500 K in table item.

I do not understand how you can 'reduce to 20 mln'. Do you mean per partition?

> The second option select looks like this:
> select item.*
> from item
>     leftjoin elements on(item.id_item=elements.id_item)
> where....
> and5= elements.id_element
> I wonder what option would be better in performance point of view. Is
> postgres able to use many different indexes with index GIN (option 1) in a
> single query ?

Assuming that you partition your tables using id_item. Postgres is not
good with partitions if joins are used. Let's have a query:
select .. from item
left join elements on (item.id_item=elements.id_item)
where id_item = 2

needs to scan all partitions in 'elements' table because planner is
not smart enough to push where condition to join clause i.e. rewrite
query like this (8.4, haven't checked in 9.x releases):

select .. from item
left join elements on (item.id_item=elements.id_item and elements.id_item = 2)
where id_item = 2

In order to use partitioning effectively all you queries need to have
constant expression (id_item = 2) in where/join on columns which are
used for partitioning

Ondrej Ivanic

pgsql-performance by date

Next:From: Stefan KellerDate: 2012-08-09 11:00:18
Subject: Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Previous:From: Craig RingerDate: 2012-08-08 06:50:26
Subject: Re: Postgres Upgrade from 8.4 to 9.1

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group