Re: small table, huge table, and a join = slow and tough query. cake inside!

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "edfialk" <edfialk(at)gmail(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: small table, huge table, and a join = slow and tough query. cake inside!
Date: 2008-05-29 00:12:06
Message-ID: 87hccinf3d.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"edfialk" <edfialk(at)gmail(dot)com> writes:

> So, what I need is all kinds of things, but to start (pseudo-query):
>
> SELECT small.fips, small.geom, small.name, SUM(huge.value) from small
> JOIN huge on huge.fips = small.fips WHERE (SUM(huge.value)) > 500 AND
> huge.pollutant='co';
>
> wonder if that makes sense. Obviously, can't have an aggregate in
> where clause, so I've tried a couple WHERE (SELECT) kind of things,
> nothing working out too well.

For this case HAVING will suffice:

select small.*, sum(huge.value)
from small
join huge on (huge.fips = small.fips)
where huge.pollutant='co'
having sum(huge.value) > 500

But in more complex cases you may have to use a subquery and further where
clauses or even joins outside the subquery.

You could write this, for example, as:

select *
from small join (
select fips,sum(huge.value) as sum
from huge
where pollutant='co'
) as huge_sum using (fips)
where huge_sum.sum > 500

Which may actually run faster (Unfortunately Postgres doesn't use the foreign
key relationship when planning so it can't reorder the join and the where
clause because it doesn't know that every "huge" record will have a matching
"small" record)

> any ideas on how I could speed up the query, I would be so extremely
> grateful.

You could try an index on <pollutant> or <fips,pollutant> but with the numbers
you're talking about they probably won't help unless you have a *lot* of
different pollutants and not all that many records for each pollutant.

To get anything more you would have to post an EXPLAIN output and preferably
an EXPLAIN ANALYZE output if you can find a query which completes.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Klint Gore 2008-05-29 00:17:03 Re: small table, huge table, and a join = slow and tough query. cake inside!
Previous Message Stephen Denne 2008-05-29 00:06:19 Re: small table, huge table, and a join = slow and tough query. cake inside!