Re: improvements to query with hierarchical elements

From: Steve Midgley <public(at)misuse(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Cc: Ryan Wallace <rywall(at)interchange(dot)ubc(dot)ca>
Subject: Re: improvements to query with hierarchical elements
Date: 2008-01-22 05:21:20
Message-ID: 20080122052133.65CF22E3881@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


>Date: Sun, 20 Jan 2008 20:01:08 -0800
>From: Ryan Wallace <rywall(at)interchange(dot)ubc(dot)ca>
>To: pgsql-sql(at)postgresql(dot)org
>Subject: improvements to query with hierarchical elements
>Message-ID: <002601c85be2$410aea30$c320be90$(at)ubc(dot)ca>
>Greetings,
>
>I have a complex query which I am trying to figure out the most
>efficient
>way of performing.
>
>My database is laid out as follows:
>items -have_many-> events -have_many-> event_locations -have_many->
>locations
>
>also rows in the location_links table link two locations together in a
>parent-child relationship and rows in the location_descendants table
>provide
>a full list of the descendants of a
>particular location.
>
>I am trying to find all locations which both are direct children of a
>given
>parent location, and are associated with at least one item in a
>constrained
>subset of items.
>(eg. Find all states of the USA in which at least one wooden axe was
>made.
>Also find the number of wooden axes made in each state.)
>
>I have developed the following query:
>
>SELECT locations.*,
> location_ids.item_count AS item_count
>FROM locations
> JOIN
> (SELECT immediate_descendants.ancestor_id AS id,
> COUNT(DISTINCT creation_events.item_id) AS
>item_count
> FROM event_locations
> JOIN
> (SELECT *
> FROM location_descendants
> WHERE ancestor_id IN
> (SELECT child_id
> FROM location_links
> WHERE parent_id = *note 1*
> )
> ) AS immediate_descendants
> ON event_locations.location_id =
>immediate_descendants.descendant_id
> JOIN
> (SELECT *
> FROM events
> WHERE item_id IN (*note 2*) AND
>association = 'creation'
> ) AS creation_events
> ON event_locations.event_id =
>creation_events.id
> GROUP BY immediate_descendants.ancestor_id
> ) AS location_ids ON locations.id = location_ids.id
>
>*note 1* - the id of the parent location.
>*note 2* - the query which returns a list of constrained item ids
>
>This works but I am looking for any way to improve the performance of
>the
>query (including changing the layout of the tables). Any ideas,
>suggestions
>or general pointers would be greatly appreciated.
>
>Thanks very much,
>Ryan

Hi Ryan,

I have built some similar queries so I might be able to help you. But
it's a little hard (for me) to dig into your query without a test set.
Could you please post some create table and insert statements to give
us a little test bed to run your query in? I realize that may be a fair
bit of work for you but it would help me to give you some ideas.

Without seeing a more formal schema and being able to toy with it, I'm
not sure I can give good advice. Others may have different opinions
which I would welcome.

Sincerely,

Steve

Browse pgsql-sql by date

  From Date Subject
Next Message silly_sad 2008-01-22 07:16:30 currval() within one statement
Previous Message Christian Schröder 2008-01-21 17:39:08 Re: (possible) bug with constraint exclusion