Skip site navigation (1) Skip section navigation (2)

BUG #3012: Wrong JOIN order when a JOIN depends on result from a LEFT JOIN.

From: "Pelle Johansson" <morth(at)morth(dot)org>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #3012: Wrong JOIN order when a JOIN depends on result from a LEFT JOIN.
Date: 2007-02-15 09:55:48
Message-ID: 200702150955.l1F9tm1V054650@wwwmaster.postgresql.org (view raw or flat)
Thread:
Lists: pgsql-bugs
The following bug has been logged online:

Bug reference:      3012
Logged by:          Pelle Johansson
Email address:      morth(at)morth(dot)org
PostgreSQL version: 8.2.3
Operating system:   RedHat CentOS 4.4
Description:        Wrong JOIN order when a JOIN depends on result from a
LEFT JOIN.
Details: 

We have a join where we select which row to join on a subquery with a
coalesce on a column from a left join, which is not working as expected.
Instead of getting one result row, we get one for each row matching the
other conditions in the join (as if the subselect didn't exist), after that
the left join is applied.

Steps to reproduce:
CREATE TABLE ta (ta_id integer PRIMARY KEY);
CREATE TABLE tb (ta_id integer NOT NULL, tb_id integer NOT NULL, type
varchar NOT NULL, ts timestamp NOT NULL);

INSERT INTO ta VALUES (1);
INSERT INTO tb VALUES (1, 1, 'appear', '2007-02-15 00:00:00');
INSERT INTO tb VALUES (1, 2, 'delete', '2007-02-15 00:00:01');
INSERT INTO tb VALUES (1, 3, 'appear', '2007-02-15 00:00:02');


SELECT
	*
FROM
	ta
	LEFT JOIN (
		SELECT
			ta_id,
			max(ts) AS ts,
			max(tb_id) AS tb_id
		FROM
			tb
		WHERE
			type = 'delete'
		GROUP BY
			ta_id
		) AS last_delete USING (ta_id)
	JOIN tb AS appear ON
		appear.ta_id = ta.ta_id
		AND appear.tb_id = (
			SELECT
				min(tb_id)
			FROM
				tb
			WHERE
				ta_id = ta.ta_id
				AND type != 'delete'
				AND coalesce(ts > last_delete.ts, true)
			);
 ta_id |         ts          | tb_id | ta_id | tb_id |  type  |         ts  
       
-------+---------------------+-------+-------+-------+--------+-------------
--------
     1 |                     |       |     1 |     1 | appear | 2007-02-15
00:00:00
     1 |                     |       |     1 |     2 | delete | 2007-02-15
00:00:01
     1 | 2007-02-15 00:00:01 |     2 |     1 |     3 | appear | 2007-02-15
00:00:02
(3 rows)

EXPLAIN SELECT ....
                                                 QUERY PLAN                 
                                
----------------------------------------------------------------------------
---------------------------------
 Hash Left Join  (cost=80.64..146.46 rows=990 width=64)
   Hash Cond: (ta.ta_id = last_delete.ta_id)
   Join Filter: (appear.tb_id = (subplan))
   ->  Hash Join  (cost=58.15..92.90 rows=990 width=52)
         Hash Cond: (appear.ta_id = ta.ta_id)
         ->  Seq Scan on tb appear  (cost=0.00..19.90 rows=990 width=48)
         ->  Hash  (cost=31.40..31.40 rows=2140 width=4)
               ->  Seq Scan on ta  (cost=0.00..31.40 rows=2140 width=4)
   ->  Hash  (cost=22.46..22.46 rows=2 width=16)
         ->  Subquery Scan last_delete  (cost=22.41..22.46 rows=2 width=16)
               ->  HashAggregate  (cost=22.41..22.44 rows=2 width=16)
                     ->  Seq Scan on tb  (cost=0.00..22.38 rows=5 width=16)
                           Filter: (("type")::text = 'delete'::text)
   SubPlan
     ->  Aggregate  (cost=27.34..27.35 rows=1 width=4)
           ->  Seq Scan on tb  (cost=0.00..27.32 rows=5 width=4)
                 Filter: ((ta_id = $0) AND (("type")::text <>
'delete'::text) AND COALESCE((ts > $1), true))
(17 rows)


Workaround:
The expected result of one result row can be obtained either by changing the
JOIN to a LEFT JOIN or by setting join_collapse_limit = 1:

SET join_collapse_limit = 1;
SELECT ....
 ta_id |         ts          | tb_id | ta_id | tb_id |  type  |         ts  
       
-------+---------------------+-------+-------+-------+--------+-------------
--------
     1 | 2007-02-15 00:00:01 |     2 |     1 |     3 | appear | 2007-02-15
00:00:02
(1 row)

                                                 QUERY PLAN                 
                                
----------------------------------------------------------------------------
---------------------------------
 Merge Join  (cost=91.63..27264.54 rows=5 width=64)
   Merge Cond: (ta.ta_id = appear.ta_id)
   Join Filter: (appear.tb_id = (subplan))
   ->  Merge Left Join  (cost=22.47..96.20 rows=2140 width=16)
         Merge Cond: (ta.ta_id = last_delete.ta_id)
         ->  Index Scan using ta_pkey on ta  (cost=0.00..68.35 rows=2140
width=4)
         ->  Sort  (cost=22.47..22.48 rows=2 width=16)
               Sort Key: last_delete.ta_id
               ->  Subquery Scan last_delete  (cost=22.41..22.46 rows=2
width=16)
                     ->  HashAggregate  (cost=22.41..22.44 rows=2 width=16)
                           ->  Seq Scan on tb  (cost=0.00..22.38 rows=5
width=16)
                                 Filter: (("type")::text = 'delete'::text)
   ->  Sort  (cost=69.16..71.63 rows=990 width=48)
         Sort Key: appear.ta_id
         ->  Seq Scan on tb appear  (cost=0.00..19.90 rows=990 width=48)
   SubPlan
     ->  Aggregate  (cost=27.34..27.35 rows=1 width=4)
           ->  Seq Scan on tb  (cost=0.00..27.32 rows=5 width=4)
                 Filter: ((ta_id = $0) AND (("type")::text <>
'delete'::text) AND COALESCE((ts > $1), true))
(19 rows)


Postgres 8.1.4 also provides the correct result, without any tweaking.

TIA,
Pelle Johansson

Responses

pgsql-bugs by date

Next:From: Phil FrostDate: 2007-02-15 14:43:17
Subject: Re: Segfaults and assertion failures with not too extraordinary views and queries
Previous:From: Tom LaneDate: 2007-02-15 03:13:56
Subject: Re: Segfaults and assertion failures with not too extraordinary views and queries

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group