Re: Finding uniques across a big join

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: "John D(dot) Burger" <john(at)mitre(dot)org>
Cc: PostgreSQL general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Finding uniques across a big join
Date: 2005-11-30 18:13:01
Message-ID: 20051130181301.GG13642@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

It will probably be a win to come up with a list of potential records
from each table, instead of after doing the 3-way join. so something
like:

(SELECT gazPlaceID FROM gazPlaces GROUP BY featureType HAVING count(*)=1)
JOIN
(SELECT ...)

If you post the output of explain (or explain analyze is even better)
then people could probably make better suggestions.

On Tue, Nov 29, 2005 at 09:58:49PM -0500, John D. Burger wrote:
> I could use some help with the following:
>
> I have a database of geographic entities with attributes spread across
> several tables. I need to determine which entities are unique with
> respect to some of those attributes. I'm using the following query:
>
> select p2.gazPlaceID from
> (select p1.featureType, n1.placeNameID, c1.containerID
> from gazPlaces as p1
> join gazNamings as n1 using (gazPlaceID)
> join gazContainers as c1 using (gazPlaceID)
> group by p1.featureType, n1.placeNameID, c1.containerID
> having count(*) = 1) as uniqs,
> gazPlaces as p2
> join gazNamings as n2 using (gazPlaceID)
> join gazContainers as c2 using (gazPlaceID)
> where uniqs.featureType = p2.featureType
> and uniqs.placeNameID = n2.placeNameID
> and uniqs.containerID = c2.containerID;
>
> The basic idea is to compute featureType-placeNameID-containerID
> combinations with a three-way join, determine which of those have a
> count of 1, and then join that back to the same three-way join to get
> the gazPlaceIDs corresponding to the unique combos (whew!).
>
> gazPlaces has about 6M entries, gazNamings and gazContainers each about
> 10M. All of the fields above are integers, and I have indexes on
> everything relevant, but the query still takes about eight hours. My
> question is not (necessarily) how to improve the efficiency of this
> query, but whether anyone can think of a faster way to compute the
> uniques. Again, the goal is to find entries in gazPlaces that are the
> only ones with their particular combination of feature type, name and
> container.
>
> Any help is appreciated!
>
> - John Burger
> MITRE
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
>

--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message frank church 2005-11-30 18:18:01 How to check options PostgreSQL was started with
Previous Message frank church 2005-11-30 18:08:06 How to change database owner in 7.4