Re: Optimizing PostgreSQL Queries

From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: Gowthamie Balamurugan <gowthamiebalamurugan(at)gmail(dot)com>
Cc: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Optimizing PostgreSQL Queries
Date: 2012-03-20 11:03:13
Message-ID: CANeAVBk=nrds-+m_=1==HU7knEMQztB=fXH2hcwp69=PvHo8Tw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On Tue, Mar 20, 2012 at 1:55 AM, Gowthamie Balamurugan <
gowthamiebalamurugan(at)gmail(dot)com> wrote:

> Dear All,
>
> I have two tables and i have to query my postgresql database.the table 1
> has about 140 million records and table 2 has around 50 million records of
> the following.
>
> the table 1 has the following structure:
>
> tr_id bigint NOT NULL, # this is the primary key
>
>
> query_id numeric(20,0), # indexed column
>
>
> descrip_id numeric(20,0) # indexed column
>
>
Do you really need numeric, or will an integer do the trick? Integers are
generally going to be faster for most operations.

> and table 2 has the following structure
>
> query_pk bigint # this is the primary key
>
>
> query_id numeric(20,0) # indexed column
>
>
> query_token numeric(20,0)
>
> The sample db of table1 would be
>
> 1 25 96
> 2 28 97
>
> 3 27 98
> 4 26 99
>
> The sample db of table2 would be
>
> 1 25 9554
> 2 25 9456
>
> 3 25 9785
> 4 25 9514
>
> 5 26 7412
> 6 26 7433
>
> 7 27 545
> 8 27 5789
>
> 9 27 1566
> 10 28 122 11 28 1456
>
> I am preferring queries in which i would be able to query in blocks of
> tr_id. In range of 10,000 as this is my requirement.
>
> I would like to get output in the following manner
>
> 25 {9554,9456,9785,9514}
>
> 26 {7412,7433}
> 27 {545,5789,1566}
>
> 28 {122,1456}
>
> I tried in the following manner
>
> select query_id, array_agg(query_token) from sch.table2 where query_id in(select query_id from sch.table1 where tr_id between 90001 and 100000) group by query_id
>
>
Have you tried a join instead of a subselect? Also, having explain output
can be helpful.

>
> I am performing the following query which takes about 121346 ms and when
> some 4 such queries are fired it still takes longer time.can you please
> help me to optimise the same.
>
> I have a machine which runs on windows 7 with i7 2nd gen proc with 8GB of
> RAM.
>
> The following is my postgresql configuration
> shared_buffers = 1GB
> effective_cache_size = 5000MB
> work_mem = 2000MB
>
> What should i do to optimize it.
>
> Thanks
>
> EDIT : it would be great if the results ordered according to the following
> format
> 25 {9554,9456,9785,9514}
> 28 {122,1456}
> 27 {545,5789,1566}
> 26 {7412,7433}
>
> ie according to the order of the queryid present in table1 ordered by
> tr_id.If this is computationally expensive may be in the client code i
> would try to optimize it.But i am not sure how efficient it would be.
>
> --
> Thanks
> Gowthamie Balamurugan
>
>

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Cleaveland 2012-03-20 20:54:27 SYSID mismatch
Previous Message Vincenzo Melandri 2012-03-20 09:37:27 Hi all guys!