Re: slow group by query

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Ellen Cyran <ellen(at)urban(dot)csuohio(dot)edu>
Cc: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: slow group by query
Date: 2002-11-19 15:04:15
Message-ID: 20021119065715.V65980-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Mon, 18 Nov 2002, Ellen Cyran wrote:

> Is there any way to make this query faster? I have indexes on year,
> msa_code, and sic. I've also tried it with
> an index on the combined group by columns. I've made both sort_mem and
> shared_buffers bigger, but still this query
> takes 40 seconds when I select 4 msa_codes and 7 minutes when I select 40
> msa_codes. Would it just be better
> to transpose the table to begin with and avoid the group by all together?
>
> SELECT b.msa_code, b.sic, b.own, b.ind_div,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1975' and bls.msa_code = b.msa_code and bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1975,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1976' and bls.msa_code = b.msa_code and bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1976,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1977' and bls.msa_code = b.msa_code and bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1977,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1978' and bls.msa_code = b.msa_code and bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1978,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1990' and bls.msa_code = b.msa_code and bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div) AS emp1990
> FROM tbl_bls_msa AS b
> where msa_code in ('1680','1640','0040','0120','0080')
> GROUP BY b.msa_code, b.sic, b.ind_div, b.own ;

Have you tried doing the subqueries in from? Right now you're running
each subquery once for each output row AFAICS.

Maybe something like (doing only one year for example - and using a
postgres extension), would work...

select distinct on (b.msa_code, b.sic, b.own, b.ind_div)
b.msa_code, b.sic, b.own, b.ind_div, y1975.emp1975 from
tbl_bls_msa as b left outer join (select emp from tbl_bls_mas as bls
where bls.year='1975') y1975 on (b.msa_code=y1975.msa_code and
b.sic=y1975.sic and b.own=y1975.own and b.ind_div=y1975.ind_div)
where msa_code in ('1680', '1640', '0040', '0120', '0080');

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2002-11-19 15:21:26 Re: slow group by query
Previous Message Tomasz Myrta 2002-11-19 14:26:36 Re: