From: | "Jim C(dot) Nasby" <jim(at)nasby(dot)net> |
---|---|
To: | "John D(dot) Burger" <john(at)mitre(dot)org> |
Cc: | PostgreSQL general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Finding uniques across a big join |
Date: | 2005-11-30 18:13:01 |
Message-ID: | 20051130181301.GG13642@nasby.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
It will probably be a win to come up with a list of potential records
from each table, instead of after doing the 3-way join. so something
like:
(SELECT gazPlaceID FROM gazPlaces GROUP BY featureType HAVING count(*)=1)
JOIN
(SELECT ...)
If you post the output of explain (or explain analyze is even better)
then people could probably make better suggestions.
On Tue, Nov 29, 2005 at 09:58:49PM -0500, John D. Burger wrote:
> I could use some help with the following:
>
> I have a database of geographic entities with attributes spread across
> several tables. I need to determine which entities are unique with
> respect to some of those attributes. I'm using the following query:
>
> select p2.gazPlaceID from
> (select p1.featureType, n1.placeNameID, c1.containerID
> from gazPlaces as p1
> join gazNamings as n1 using (gazPlaceID)
> join gazContainers as c1 using (gazPlaceID)
> group by p1.featureType, n1.placeNameID, c1.containerID
> having count(*) = 1) as uniqs,
> gazPlaces as p2
> join gazNamings as n2 using (gazPlaceID)
> join gazContainers as c2 using (gazPlaceID)
> where uniqs.featureType = p2.featureType
> and uniqs.placeNameID = n2.placeNameID
> and uniqs.containerID = c2.containerID;
>
> The basic idea is to compute featureType-placeNameID-containerID
> combinations with a three-way join, determine which of those have a
> count of 1, and then join that back to the same three-way join to get
> the gazPlaceIDs corresponding to the unique combos (whew!).
>
> gazPlaces has about 6M entries, gazNamings and gazContainers each about
> 10M. All of the fields above are integers, and I have indexes on
> everything relevant, but the query still takes about eight hours. My
> question is not (necessarily) how to improve the efficiency of this
> query, but whether anyone can think of a faster way to compute the
> uniques. Again, the goal is to find entries in gazPlaces that are the
> only ones with their particular combination of feature type, name and
> container.
>
> Any help is appreciated!
>
> - John Burger
> MITRE
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
>
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
From | Date | Subject | |
---|---|---|---|
Next Message | frank church | 2005-11-30 18:18:01 | How to check options PostgreSQL was started with |
Previous Message | frank church | 2005-11-30 18:08:06 | How to change database owner in 7.4 |