Re: very large tables

From: Otandeka Simon Peter <sotandeka(at)gmail(dot)com>
To: Ramiro Diaz Trepat <ramiro(at)diaztrepat(dot)name>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: very large tables
Date: 2009-05-25 09:15:57
Message-ID: efc321cd0905250215r571c6e9eo59664f5da816ade8@mail.gmail.com
Lists: pgsql-general

Try partitioning. It should sort you out.
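For a table this size, partitioning on the column you filter with
equality (moment looks like the natural candidate) lets the planner
skip most of the data outright. Below is a rough sketch using the
inheritance-based partitioning that PostgreSQL 8.x offers; the child
table name and the yearly granularity are only illustrative, pick
whatever ranges match your queries:

  -- parent stays empty; create one child table per year
  CREATE TABLE atmospherepoint_1979 (
      CHECK (moment >= '1979-01-01' AND moment < '1980-01-01')
  ) INHERITS (atmospherepoint);

  -- index each child on the partitioning column
  CREATE INDEX atmospherepoint_1979_moment_idx
      ON atmospherepoint_1979 (moment);

  -- let the planner skip children whose CHECK excludes the WHERE clause
  SET constraint_exclusion = on;

Load each year's rows into its own child, and any query that constrains
moment will only touch the matching partition. See also the index note
after the quoted message below.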

-Peter-

On 5/25/09, Ramiro Diaz Trepat <ramiro(at)diaztrepat(dot)name> wrote:
>
> Hello list, I will try to make this as brief as possible.
> I have a brother who is a scientist studying atmospheric problems. He was
> trying to handle all of his data with flat files and MatLab, when I stepped
> in and said, wait, you need an RDBMS to handle all this data.
> So, he basically has 2 very simple tables: one describing an atmosphere
> pixel (latitude, longitude and height) plus a couple of other simple
> values, and another holding the values of different variables measured at
> each pixel.
> The table with the atmosphere pixels currently has about 140MM records,
> and the one with the values about 1000MM records. They should both grow to
> about twice this size.
> I have not yet started to deal with the largest table; I am only doing
> tests with the smaller one (140MM rows), and they are much slower than
> what we were expecting.
> A simple query with no joins like this:
>
> select trajectory from atmospherepoint where moment='1979-05-02 11:45:00'
> and height >= 12000 and height <= 14000 and longitude >= 180 and longitude
> <= 190 and latitude >= 0 and latitude <= 10;
>
> is taking almost 4 minutes on a decent multi-core server. Also, the moment
> equality test should yield no more than 2MM rows, so I thought that should
> make things a lot faster.
>
> The EXPLAIN output shows the suspicious "Seq Scan" up front:
>
> Seq Scan on atmospherepoint (cost=0.00..5461788.08 rows=134 width=8)
> Filter: ((moment = '1979-05-02 11:45:00'::timestamp without time zone)
> AND (height >= 12000::double precision) AND (height <= 14000::double
> precision) AND (longitude >= 180::double precision) AND (longitude <=
> 190::double precision) AND (latitude >= 0::double precision) AND (latitude
> <= 10::double precision))
>
> but I have created indices for every column in the table, and I have also
> run ANALYZE and VACUUM:
>
>           Table "public.atmospherepoint2"
>    Column   |            Type             | Modifiers
>  -----------+-----------------------------+-----------
>  id         | integer                     | not null
>  trajectory | integer                     | not null
>  moment     | timestamp without time zone | not null
>  longitude  | real                        | not null
>  latitude   | real                        | not null
>  height     | real                        | not null
> Indexes:
>     "atmospherepoint2_pkey" PRIMARY KEY, btree (id)
>     "atm_height_idx" btree (height)
>     "atm_latitude_idx" btree (latitude)
>     "atm_longitude_idx" btree (longitude)
>     "atm_moment_idx" btree (moment)
>     "atm_trajectory_idx" btree (trajectory)
> Foreign-key constraints:
>     "atmospherepoint2_trajectory_fkey" FOREIGN KEY (trajectory)
>       REFERENCES trajectory2(id)
>
>
> I would very much welcome any advice or pointers on how to tune these
> tables to work faster.
> The tables don't change at all once the data has been loaded; they are
> only used for queries.
> Thank you very much.
>
>
> r.
>
>
>
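As for why the single-column indexes above are not used: each range
predicate on its own (height, longitude, latitude) matches a large
fraction of the table, so the planner estimates a sequential scan to be
cheaper than combining them. A multicolumn index led by the equality
column should help even before partitioning; the index name and the
choice of second column here are just a suggestion:

  -- equality column first, then one of the range columns
  CREATE INDEX atm_moment_height_idx
      ON atmospherepoint (moment, height);

With that index the planner can descend straight to the ~2MM rows for
the requested moment, scan the height range within them, and filter the
remaining conditions. Re-run EXPLAIN after an ANALYZE to check that it
is picked up.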
