improving a badly optimized query

From: Brandon Craig Rhodes <brandon(at)oit(dot)gatech(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: improving a badly optimized query
Date: 2002-11-19 18:09:06
Message-ID: w6lm3pe8jh.fsf@guinness.ts.gatech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

A query has surprised me by the amount of time it takes, and it seems
that PostgreSQL is performing insufficient optimization. To make the
example simple, consider the following database:

CREATE TABLE role_keys (
role SERIAL PRIMARY KEY
);
CREATE TABLE role_person (
role INTEGER UNIQUE NOT NULL REFERENCES role_keys,
person INTEGER NOT NULL
);
CREATE INDEX role_person_index ON role_person (person);
CREATE VIEW roles AS SELECT
role_keys.role, person
FROM role_keys NATURAL LEFT JOIN role_person;

Having populated these tables, I attempted the following query:

SELECT * FROM roles WHERE person = 28389;

It turns out that this query - equivalent to query (a) shown below -
takes more than ten times the amount of time required by query (b),
despite being guaranteed to give exactly the same result!

(a) (slow)
SELECT * FROM role_keys NATURAL LEFT JOIN role_person
WHERE person = 28389;

(b) (fast)
SELECT * FROM role_keys NATURAL JOIN role_person
WHERE person = 28389;

Apparently PostgreSQL does not realize that the rows created for
unmatched role_keys rows by the LEFT JOIN are guaranteed to be thrown
out by the WHERE clause (their `person' fields will be null). Because
of this it reads through the entire role_keys table:

(a) (when run with EXPLAIN)
Merge Join (cost=0.00..3990.83 rows=67524 width=12)
-> Index Scan using role_keys_pkey on role_keys
(cost=0.00..1280.67 rows=67524 width=4)
-> Index Scan using role_person_role_key on role_person
(cost=0.00..1359.68 rows=67525 width=8)

(b) (when run with EXPLAIN)
Nested Loop (cost=0.00..6.91 rows=1 width=12)
-> Index Scan using role_person_index on role_person
(cost=0.00..3.02 rows=1 width=8)
-> Index Scan using role_keys_pkey on role_keys
(cost=0.00..3.01 rows=1 width=4)

It is not obvious to me where in PostgreSQL's optimization routine to
insert the intelligence to reduce this from a `LEFT JOIN' to a `JOIN'.
Has anyone else had to deal with this case?

The VIEW itself must be a LEFT JOIN because I need all roles to appear
when I query the view; but I will frequently need to do queries like
the above, and would like to avoid either (a) having to create a
separate view for each combination of fields on which I might search,
or (b) querying using the raw database tables since I would like the
actualy design hidden from my business logic.

Thanks for any ideas,
--
Brandon Craig Rhodes http://www.rhodesmill.org/brandon
Georgia Tech brandon(at)oit(dot)gatech(dot)edu

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Medi Montaseri 2002-11-19 18:15:21 Re: stability of pg library usage
Previous Message Medi Montaseri 2002-11-19 17:48:30 Re: help in starting up / shutting down postgres as another