Re: Wrong query plan when using a left outer join

From: Feike Steenbergen <feikesteenbergen(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Wrong query plan when using a left outer join
Date: 2012-01-17 18:53:50
Message-ID: CAK_s-G3tj7gv0WhVWCP5K6bzF874KYWH87_fbAEk3no9jbUzbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> BTW, add a foreign key and index on handhistory_plain.hand_id (unless> you have it already).
It's there already:

feiketracker=# \d+ handhistory_plain;
Table "stage.handhistory_plain"
Column | Type | Modifiers | Storage | Description
---------+---------+-----------+----------+-------------
hand_id | integer | not null | plain |
history | text | not null | extended |
Indexes:
"handhistory_plain_pkey" PRIMARY KEY, btree (hand_id) CLUSTER
Foreign-key constraints:
"handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES
hand_meta(hand_id)

> BTW2, if you really don't care on handhistory you can just use
> original query with no join.

Well, sometimes I do, sometimes I don't. For easier application access
I wanted to create a view that joins both these tables together:
easier application design and better performance, as the analyzer
should know best when not to use the handhistory_plain table.

The design is as follows:

hand_meta - holds all metadata for a pokerhand
handhistory_plain holds the history for a pokerhand

hand_meta is going to be used the most, it is around 165 bytes per tuple
handhistory_plain is not going to be used often (it is there as a
reference); it is around 5000 bytes per tuple.

They both hold the same column as primary key, handhistory_plain holds
a fraction of the tuples of hand_meta, the split was only made to make
sure the processed data (hand_meta) is smaller in size and should
therefore require less I/O and thus increase performance.

I'm not sure what to make of:
> imagine that the view on the right side of join has some side effects.
I can see some side effects may occur, but as it is a left join, the
left hand side will always be part of the returning set (there is no
where clause), so the index should be used.
Even though I don't understand, you seem to be right, a natural join
is 30 times faster:

feiketracker=# explain analyze select max(hand_id) from hand_meta left
join handhistory_plain using(hand_id);
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=10000049261.00..10000049261.01 rows=1 width=4)
(actual time=31179.238..31179.241 rows=1 loops=1)
-> Seq Scan on hand_meta (cost=10000000000.00..10000043062.40
rows=2479440 width=4) (actual time=0.131..16039.886 rows=2479440
loops=1)
Total runtime: 31179.725 ms
(3 rows)

Time: 31185.088 ms

feiketracker=# explain analyze select max(hand_id) from hand_meta join
handhistory_plain using(hand_id);

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=53043.61..53043.62 rows=1 width=4) (actual
time=962.242..962.245 rows=1 loops=1)
-> Nested Loop (cost=0.00..53029.93 rows=5470 width=4) (actual
time=0.400..920.582 rows=5470 loops=1)
-> Index Scan using handhistory_plain_pkey on
handhistory_plain (cost=0.00..14494.27 rows=5470 width=4) (actual
time=0.215..101.177 rows=5470 loops=1)
-> Index Scan using hand_meta_pkey on hand_meta
(cost=0.00..7.03 rows=1 width=4) (actual time=0.100..0.115 rows=1
loops=5470)
Index Cond: (hand_meta.hand_id = handhistory_plain.hand_id)
Total runtime: 962.968 ms

> try to experiment with SET enable_seqscan TO false; - and see what happens.
Didn't make a difference; therefore I think postgres determines it is
unable to use the index, is that correct?

Thank you for now: I'll use the inner join (or natural join in this
case) for this specific view

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Feike Steenbergen 2012-01-17 18:55:55 Re: Wrong query plan when using a left outer join
Previous Message Thomas Kellerer 2012-01-17 16:28:40 Re: UPDATE COMPATIBILITY