Re: sub-query optimization

From: Brad Hilton <bhilton(at)vpop(dot)net>
To: Tomasz Myrta <jasiek(at)klaster(dot)net>
Cc: jasiek(at)serwer(dot)skawsoft(dot)com(dot)pl, pgsql-sql(at)postgresql(dot)org
Subject: Re: sub-query optimization
Date: 2003-02-14 22:39:31
Message-ID: 1045262371.29955.124.camel@aragorn.vpop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, 2003-02-14 at 04:59, Tomasz Myrta wrote:
> Brad Hilton wrote:
> <cut>
> > select * from articles where exists
> > (select 1 from article_categories, categories, category_map
> > where
> > article_categories.article_id = articles.id and
> > categories.restrict_views = FALSE and
> > article_categories.category_id = categories.id and
> > category_map.parent_id = 1 and
> > category_map.child_id = categories.id and
> > category_map.child_id = article_categories.category_id
> > )
> > and
> > post_status = 'publish'
>
> According to your table definition I can say, that you don't need subselect
> and exists, because 1 row from article and 1 row from categories have only 1
> hit row in articles_categories (primary key),

I don't think the article_categories primary key can be used in my query
since I'm also joining against category_map. Articles can live in
multiple categories. What my query is attempting is (in english terms):

select all articles that live in non-restricted categories at or below a
top-level category (id=1 in this case).

If I just utilize article_categories primary key, I could end up with
duplicate articles since articles can live in multiple categories.

> so you can rewrite your query
> as simple joins:
> (Query is only a hint, it probably won't work)
>
> select a.*
> from
> categories_c cross join category_map m
> join articles a on (child_id=category_id)
> join articles_categories ac using (article_id,category_id)
> where
> m.parent_id=1 and not c.restrict_views;
> and a.post_status='publish'
>

In case I'm not understanding your suggestiong perfectly, I tried to
flesh it out a bit more. Does the following query match your
suggestion?

select a.*
from
categories c cross join category_map m
join article_categories ac on (c.id = ac.category_id and m.child_id =
ac.category_id)
join articles a on (a.id = ac.article_id)
where
m.parent_id=1 and
not c.restrict_views and
m.child_id = c.id and
a.post_status='publish'

Unfortunately, this query returns duplicate articles (see explanation
above), and is fairly slow. Maybe I didn't follow your initial query
properly.

-Brad

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message George 2003-02-14 23:43:19
Previous Message Brad Hilton 2003-02-14 22:26:18 Re: sub-query optimization