Performance of full outer join in 8.3

From: Christian Schröder <cs(at)deriva(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Performance of full outer join in 8.3
Date: 2009-04-15 11:03:53
Message-ID: 49E5BF19.2040209@deriva.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi list,
we have just migrated one of our databases from 8.2.12 to 8.3.7. We now
experience a strange problem: A query that was really fast on the 8.2
server is now much slower on the 8.3 server (1 ms vs. 60 sec). I had a
look at the query plan and it is completely different. Both servers run
on the same machine. The configuration (planner constants etc.) is
identical. The database has been vacuum analyzed after the migration. So
why the difference?

This is the query:
select isin from ts_frontend.attachment_isins full OUTER JOIN
ts_frontend.rec_isins using (attachment,isin) WHERE attachment=2698120
GROUP BY isin limit 1000;

Here is the explain analyze in 8.2:


QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=826.44..826.61 rows=17 width=32) (actual time=0.163..0.172
rows=2 loops=1)
-> HashAggregate (cost=826.44..826.61 rows=17 width=32) (actual
time=0.159..0.162 rows=2 loops=1)
-> Merge Full Join (cost=799.62..826.40 rows=17 width=32)
(actual time=0.122..0.144 rows=2 loops=1)
Merge Cond: (("outer"."?column3?" = "inner"."?column3?")
AND (attachment_isins.attachment = rec_isins.attachment))
Filter: (COALESCE(attachment_isins.attachment,
rec_isins.attachment) = 2698120)
-> Sort (cost=13.39..13.74 rows=138 width=20) (actual
time=0.065..0.067 rows=1 loops=1)
Sort Key: (attachment_isins.isin)::bpchar,
attachment_isins.attachment
-> Index Scan using
attachment_isins_attachment_idx on attachment_isins (cost=0.00..8.49
rows=138 width=20) (actual time=0.042..0.047 rows=1 loops=1)
Index Cond: (attachment = 2698120)
-> Sort (cost=786.23..794.80 rows=3429 width=20)
(actual time=0.045..0.049 rows=2 loops=1)
Sort Key: (rec_isins.isin)::bpchar,
rec_isins.attachment
-> Index Scan using idx_rec_isins_attachment on
rec_isins (cost=0.00..584.89 rows=3429 width=20) (actual
time=0.019..0.024 rows=2 loops=1)
Index Cond: (attachment = 2698120)
Total runtime: 0.302 ms
(14 rows)

And this is the 8.3 plan:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=345890.35..345900.35 rows=1000 width=26) (actual
time=53926.706..53927.071 rows=2 loops=1)
-> HashAggregate (cost=345890.35..346296.11 rows=40576 width=26)
(actual time=53926.702..53927.061 rows=2 loops=1)
-> Merge Full Join (cost=71575.91..345788.91 rows=40576
width=26) (actual time=10694.727..53926.559 rows=2 loops=1)
Merge Cond: (((rec_isins.isin)::bpchar =
(attachment_isins.isin)::bpchar) AND (rec_isins.attachment =
attachment_isins.attachment))
Filter: (COALESCE(attachment_isins.attachment,
rec_isins.attachment) = 2698120)
-> Index Scan using rec_isin_pkey on rec_isins
(cost=0.00..229562.97 rows=8115133 width=17) (actual
time=0.141..18043.605 rows=8036226 loops=1)
-> Materialize (cost=71575.91..78318.19 rows=539383
width=17) (actual time=10181.074..14471.215 rows=539101 loops=1)
-> Sort (cost=71575.91..72924.36 rows=539383
width=17) (actual time=10181.064..13019.906 rows=539101 loops=1)
Sort Key: attachment_isins.isin,
attachment_isins.attachment
Sort Method: external merge Disk: 18936kB
-> Seq Scan on attachment_isins
(cost=0.00..13111.83 rows=539383 width=17) (actual time=0.036..912.963
rows=539101 loops=1)
Total runtime: 53937.213 ms
(12 rows)

These are the table definitions:
Table "ts_frontend.attachment_isins"
Column | Type | Modifiers
--------------+--------------------------------+-----------
attachment | integer | not null
isin | isin | not null
editor | name |
last_changed | timestamp(0) without time zone |
Indexes:
"attachment_isins_pkey" PRIMARY KEY, btree (attachment, isin)
"attachment_isins_attachment_idx" btree (attachment)
"attachment_isins_attachment_isin" btree (attachment, isin)
"attachment_isins_isin_idx" btree (isin)
Foreign-key constraints:
"attachment_isins_attachment_fkey" FOREIGN KEY (attachment)
REFERENCES ts_frontend.attachments(id) ON UPDATE CASCADE ON DELETE CASCADE

Table "ts_frontend.rec_isins"
Column | Type | Modifiers
------------+---------+-----------
attachment | integer | not null
isin | isin | not null
Indexes:
"rec_isin_pkey" PRIMARY KEY, btree (isin, attachment)
"idx_rec_isins_attachment" btree (attachment)
Foreign-key constraints:
"rec_isins_attachment_fkey" FOREIGN KEY (attachment) REFERENCES
ts_frontend.attachments(id) ON UPDATE CASCADE ON DELETE CASCADE

Thanks for any ideas!

Regards
Christian

P.S.: I think the full outer join is not what the developer really
wanted to do. Instead, he should have done a union (which is pretty
fast, by the way). However, I still want to understand why the query
plan of his query changed between both database releases.

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax: +49 551 489500-91
Hans-Böckler-Straße 2 http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Quan Zongliang 2009-04-15 11:42:50 how to escape back-slash mark.
Previous Message Peeyush 2009-04-15 10:57:50 need information

Browse pgsql-hackers by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2009-04-15 11:43:04 Re: Performance of full outer join in 8.3
Previous Message Greg Stark 2009-04-15 10:50:08 Re: Replacing plpgsql's lexer