Re: Wrong query plan when using a left outer join

From: Feike Steenbergen <feikesteenbergen(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Wrong query plan when using a left outer join
Date: 2012-01-17 18:55:55
Message-ID: CAK_s-G0MijfOu5JBPYhj0NjhHTe9fNc3UrOnCyM8=tUvaTQtkg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

oops, but ofcourse, a natural view will not give the correct answer,
back to the drawing board ...

On Tue, Jan 17, 2012 at 19:53, Feike Steenbergen
<feikesteenbergen(at)gmail(dot)com> wrote:
>> BTW, add a foreign key and index on handhistory_plain.hand_id (unless> you have it already).
> It's there already:
>
> feiketracker=# \d+ handhistory_plain;
>            Table "stage.handhistory_plain"
>  Column  |  Type   | Modifiers | Storage  | Description
> ---------+---------+-----------+----------+-------------
>  hand_id | integer | not null  | plain    |
>  history | text    | not null  | extended |
> Indexes:
>    "handhistory_plain_pkey" PRIMARY KEY, btree (hand_id) CLUSTER
> Foreign-key constraints:
>    "handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES
> hand_meta(hand_id)
>
>> BTW2, if you really don't care on handhistory you can just use
>> original query with no join.
>
> Well, sometimes I do, sometimes I don't. For easier application access
> I wanted to create a view that joins both these tables together:
> easier application design and better performance, as the analyzer
> should know best when not to use the handhistory_plain table.
>
>
> The design is as follows:
>
> hand_meta - holds all metadata for a pokerhand
> handhistory_plain holds the history for a pokerhand
>
> hand_meta is going to be used the most, it is around 165 bytes per tuple
> handhistory_plain is not going to be used often (it is there as a
> reference); it is around 5000 bytes per tuple.
>
> They both hold the same column as primary key, handhistory_plain holds
> a fraction of the tuples of hand_meta, the split was only made to make
> sure the processed data (hand_meta) is smaller in size and should
> therefore require less I/O and thus increase performance.
>
> I'm not sure what to make of:
>> imagine that the view on the right side of join has some side effects.
> I can see some side effects may occur, but as it is a left join, the
> left hand side will always be part of the returning set (there is no
> where clause), so the index should be used.
> Even though I don't understand, you seem to be right, a natural join
> is 30 times faster:
>
> feiketracker=# explain analyze select max(hand_id) from hand_meta left
> join handhistory_plain using(hand_id);
>                                                                 QUERY
> PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=10000049261.00..10000049261.01 rows=1 width=4)
> (actual time=31179.238..31179.241 rows=1 loops=1)
>   ->  Seq Scan on hand_meta  (cost=10000000000.00..10000043062.40
> rows=2479440 width=4) (actual time=0.131..16039.886 rows=2479440
> loops=1)
>  Total runtime: 31179.725 ms
> (3 rows)
>
> Time: 31185.088 ms
>
> feiketracker=# explain analyze select max(hand_id) from hand_meta join
> handhistory_plain using(hand_id);
>
>     QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=53043.61..53043.62 rows=1 width=4) (actual
> time=962.242..962.245 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..53029.93 rows=5470 width=4) (actual
> time=0.400..920.582 rows=5470 loops=1)
>         ->  Index Scan using handhistory_plain_pkey on
> handhistory_plain  (cost=0.00..14494.27 rows=5470 width=4) (actual
> time=0.215..101.177 rows=5470 loops=1)
>         ->  Index Scan using hand_meta_pkey on hand_meta
> (cost=0.00..7.03 rows=1 width=4) (actual time=0.100..0.115 rows=1
> loops=5470)
>               Index Cond: (hand_meta.hand_id = handhistory_plain.hand_id)
>  Total runtime: 962.968 ms
>
>
>> try to experiment with SET enable_seqscan TO false; - and see what happens.
> Didn't make a difference; therefore I think postgres determines it is
> unable to use the index, is that correct?
>
>
> Thank you for now: I'll use the inner join (or natural join in this
> case) for this specific view

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Rosser Schwarz 2012-01-17 19:13:45 Re: Wrong query plan when using a left outer join
Previous Message Feike Steenbergen 2012-01-17 18:53:50 Re: Wrong query plan when using a left outer join