Re: How to optimize SQL query ?

From: Cédric Dufour (Cogito Ergo Soft) <cedric(dot)dufour(at)cogito-ergo-soft(dot)com>
To: "Milosz Krajewski" <Milosz(dot)Krajewski(at)FinSkog(dot)com(dot)pl>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: How to optimize SQL query ?
Date: 2002-08-02 13:17:29
Message-ID: NDBBIFNBODNADCAOFDOAIEJFCDAA.cedric.dufour@cogito-ergo-soft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> -----Original Message-----
> From: pgsql-sql-owner(at)postgresql(dot)org
> [mailto:pgsql-sql-owner(at)postgresql(dot)org]On Behalf Of Milosz Krajewski
> Sent: Monday, July 29, 2002 16:50
> To: pgsql-sql(at)postgresql(dot)org
> Subject: [SQL] How to optimize SQL query ?
>
>
> How to optimize query or just force postgre to do it my way ?
>
> Example:
> table continets (
> id numeric, ..., active numeric
> );
>
> table countries (
> id numeric, id_continent numeric, ..., active numeric
> );
>
> table cities (
> id numeric, id_country numeric, ..., active numeric
> );
>
> relations:
> cities.id_county are in countries.id
> countries.id_continent are on continents.id
>
> Query:
> table temp_cities (
> id_city numeric;
> );
>
>
> temp_cities is temp table which holds few (~20) id of cities, to
> show them.
>
> so:
> select * from cities
> where cities.id in (select id_city from temp_cities);

AVOID: indexes (which you should have defined on primary keys [implicitely
defined by PostgreSQL] and foreign keys [must be defined explicitely]) are
not used

> or:
> select * from cities, temp_cities tmp
> where cities.id = tmp.id_city;

BETTER ;-)

> works fine.
>
> But the problem starts here:
>
> select * from cities, coutries, continets
> where
> (cities.id in (select id_city from temp_cities)) and
> (cities.id_county = countries.id) and
> (countries.id_continent = continents.id) and
> (cities.active = 1) and (coutries.active = 1) and
> (continents.active = 1)
>
> (active means is row active or archive, many of them are active,
> but I have to check it)
>
> Posgre is planning it like this:
> joins cities with coutries
> joins countries with continents
> selects active
> filtering with cities.id (with temp_cities)
>
> If I could force it to filter cities.id first
> (I can do this with Oracle by changing
> "select id_city from temp_cities" to
> "select id_city from temp_cities group by id_city")
> it will work much (1000x) faster.
>
> Can I force postgre do it my way ?

Use the explicit JOIN syntax and join each table one after another in the
order you feel is the more adequate for your query. PostgreSQL will respect
this order.

From one I understand, you should write it this way:
SELECT
*
FROM
continents
INNER JOIN
countries
ON ( continents.id = country.id_continent )
INNER JOIN
cities
ON ( countries.id = cities.id_country )
INNER JOIN
temp_cities
ON ( cities.id = temp_cities.id )
WHERE
( continents.active = 1 )
AND ( countries.active = 1 )
AND ( cities.active = 1 )

The reason to do so are:
1. Joining first on the tables that contain the less rows contributes to
keep the cartesian product between the joins as low as possible
2. Thus if a continent - respectively country - is not active, it will be
casted out from the join immediately and thus reduce the cartesian product
for the next join(s)
3. Joining on 'temp-cities' allows the usage of the hopefully defined index

I achieved ratio from 10000 to 1 respecting this strategy on a scenario
fairly closed to yours ;-)

Ce.D

> --
> [ Milosz "Krashan" Krajewski ][ mail: vilge(at)mud(dot)org(dot)pl, UIN: 1319535 ]
> [ inet: Vilge, Vilgefortz ][ www: http://www.ds2.pg.gda.pl/~krash ]
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2002-08-02 13:47:20 Re: Group And Sort After Union
Previous Message Cédric Dufour (Cogito Ergo Soft) 2002-08-02 12:59:44 Re: Returning PK of first insert for second insert use.