DIfferent plans for explicit versus implicit join using link table

From: "John D(dot) Burger" <john(at)mitre(dot)org>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: DIfferent plans for explicit versus implicit join using link table
Date: 2007-04-26 17:22:18
Message-ID: 7ABD4750-BFC5-4E5A-BD89-2EAF8478B284@mitre.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi -

I have a table of words and a table linking words in various ways:

create table allWords (
wordID serial PRIMARY KEY,
word text
);
create unique index ix_allwords_word ON allwords (word);

create table allWordRelations (
word1ID integer references allWords,
word2ID integer references allWords,
pos1 integer references posTypes,
pos2 integer references posTypes,
relID integer references allRelationTypes,
confidence float,
primary key (word1ID, word2ID, pos1, pos2, relID)
);
create index ix_allWordRelations_word1_pos1 on allWordRelations
(word1ID, pos1);
create index ix_allWordRelations_word2_pos2 on allWordRelations
(word2ID, pos2);

I have two queries for looking up related words which I think should
be equivalent, but 7.4.8 comes up with very different plans. The
first query joins the word table to itself explicitly via the
relations table - this is very fast. The second query uses an IN
against the link table in the where clause, and is very slow. I'm
sure I can affect this by adding indexes, but I'm mainly trying to
understand what difference the planner is seeing. EXPLAIN ANALYZE
output is below - can anyone explain? Are my two queries subtly
different in terms of NULLs, or something like that? Thanks.

- John Burger
MITRE

explain analyze select w2.word from allwords w1 join allwordrelations
as r on (w1.wordid = r.word1id) join allwords w2 on (w2.wordid =
r.word2id) where w1.word = 'dogging';

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------------------
Nested Loop (cost=0.00..579.05 rows=81 width=15) (actual
time=0.607..30.509 rows=59 loops=1)
-> Nested Loop (cost=0.00..333.94 rows=81 width=4) (actual
time=0.564..29.032 rows=59 loops=1)
-> Index Scan using ix_allwords_word on allwords w1
(cost=0.00..3.49 rows=1 width=4) (actual time=0.326..0.329 rows=1
loops=1)
Index Cond: (word = 'dogging'::text)
-> Index Scan using ix_allwordrelations_word1_pos1 on
allwordrelations r (cost=0.00..329.36 rows=87 width=8) (actual
time=0.220..28.564 rows=59 loops=1)
Index Cond: ("outer".wordid = r.word1id)
-> Index Scan using allwords_pkey on allwords w2
(cost=0.00..3.01 rows=1 width=19) (actual time=0.018..0.020 rows=1
loops=59)
Index Cond: (w2.wordid = "outer".word2id)
Total runtime: 30.713 ms

explain analyze select w2.word from allwords w1, allwords w2 where
(w1.wordid, w2.wordid) in (select word1id, word2id from
allwordrelations ) and w1.word = 'dogging';

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-----
Nested Loop (cost=760422.86..817628.29 rows=1 width=15) (actual
time=99277.403..111291.862 rows=59 loops=1)
-> Hash Join (cost=760422.86..817625.27 rows=1 width=4) (actual
time=99277.110..111270.093 rows=59 loops=1)
Hash Cond: ("outer".word1id = "inner".wordid)
-> Unique (cost=760419.36..794740.32 rows=4576128
width=8) (actual time=96713.791..107843.446 rows=4302242 loops=1)
-> Sort (cost=760419.36..771859.68 rows=4576128
width=8) (actual time=96713.785..102973.088 rows=4576035 loops=1)
Sort Key: allwordrelations.word1id,
allwordrelations.word2id
-> Seq Scan on allwordrelations
(cost=0.00..79409.28 rows=4576128 width=8) (actual
time=0.008..8668.255 rows=4576035 loops=1)
-> Hash (cost=3.49..3.49 rows=1 width=4) (actual
time=0.078..0.078 rows=0 loops=1)
-> Index Scan using ix_allwords_word on allwords w1
(cost=0.00..3.49 rows=1 width=4) (actual time=0.067..0.070 rows=1
loops=1)
Index Cond: (word = 'dogging'::text)
-> Index Scan using allwords_pkey on allwords w2
(cost=0.00..3.01 rows=1 width=19) (actual time=0.360..0.363 rows=1
loops=59)
Index Cond: (w2.wordid = "outer".word2id)
Total runtime: 111292.449 ms

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jon Sime 2007-04-26 18:12:41 Re: Feature request - have postgresql log warning when new sub-release comes out.
Previous Message Gerhard Wiesinger 2007-04-26 17:02:36 Re: Fw: PostgreSQL Performance Tuning