optimizing away join when querying view

From: Jacob Costello <jake(at)suntradingllc(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: optimizing away join when querying view
Date: 2006-02-08 13:54:51
Message-ID: 1139406891.11389.60.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Postgres doesn't seem to optimize away unnecessary joins in a view
definition when the view is queried in such a way that the join need not
be executed. In the example below, I define two tables, foo and bar,
with a foreign key on bar referencing foo, and a view on the natural
join of the tables. The tables are defined so that the relationship
from bar to foo is allowed to be many to one, with the column of bar
referencing foo (column a) set NOT NULL, so that there must be exactly
one foo record for every bar record. I then EXPLAIN selecting the "b"
column from bar, through the view and from bar directly. The tables
have been ANALYZEd but have no data. EXPLAIN shows the join actually
occurring when selecting b from the view quux. If I understand
correctly (maybe I don't), this is guaranteed to be exactly the same as
the selecting b directly from the bar table. The practical import of
this comes into play when views are provided to simplify queries for end
users, and those views use joins to include related data. If the user
enters a query that is equivalent to a query on a base table, why should
the query pay a performance penalty ?

table foo:

Column | Type | Modifiers
--------+---------+-----------
a | integer | not null
Indexes:
"foo_pkey" PRIMARY KEY, btree (a)

table bar:

Column | Type | Modifiers
--------+---------+-----------
a | integer | not null
b | integer |
Foreign-key constraints:
"bar_a_fkey" FOREIGN KEY (a) REFERENCES foo(a)

view quux:

Column | Type | Modifiers
--------+---------+-----------
a | integer |
b | integer |
View definition:
SELECT bar.a, bar.b
FROM bar
NATURAL JOIN foo

EXPLAINed Queries:

explain select b from bar;

QUERY PLAN
---------------------------------------------------
Seq Scan on bar (cost=0.00..1.00 rows=1 width=4)
(1 row)

explain select b from quux;

QUERY PLAN
--------------------------------------------------------------------------
Nested Loop (cost=0.00..5.84 rows=1 width=4)
-> Seq Scan on bar (cost=0.00..1.00 rows=1 width=8)
-> Index Scan using foo_pkey on foo (cost=0.00..4.82 rows=1
width=4)
Index Cond: ("outer".a = foo.a)
(4 rows)

--
Jacob Costello <jake(at)suntradingllc(dot)com>
Sun Trading, LLC

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ron 2006-02-08 14:11:11 Size and performance hit from using UTF8 vs. ASCII?
Previous Message Markus Schaber 2006-02-08 13:38:08 Re: Default autovacuum settings too conservative