Performance problem on 2 PG versions on same query

From: Rémy-Christophe Schermesser <rcs(at)netcosports(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Bertrand Paquet <bertrand(at)netcosports(dot)com>
Subject: Performance problem on 2 PG versions on same query
Date: 2014-11-05 16:31:28
Message-ID: CAJDhA5z0rq35BqX+LK-zcByA0mxNFi95s2aPg5=L8+B7aO8yCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

We have 2 instances of PG, one in 9.1.1 and 9.1.14. They have the same
data, schema, PG configuration, and are almost identical machines, same
number of cores and memory, but different cloud provider. The data was
transferred with a pg_dump/pg_restore. We ran VACUUM ANALYSE, ANALYSE, and
REINDEX on both machines.

One query take ~11 seconds on 9.1.1 and ~25 minutes on 9.1.14.

Can you explain why there is such a big difference, and how to fix it on
9.1.14.
On the explain plans (below), there is two difference that we can not
explain :

- Using of Hash Anti Join in 9.1.1, and Merge Anti Join in 9.1.14
- The usage of the index andalertsmatch_a_mid_idx in 9.1.14 and no index
on 9.1.1

Thanks for your help,

*Here is the query : *
SELECT a.* , e.id AS eid, e.fk_compc_id AS ecid
FROM gsm.andalertsteam a
JOIN gsm.events e ON (a.fk_team_id = e.part_a1_id OR e.part_b1_id =
a.fk_team_id)
LEFT OUTER JOIN gsm.andalertsmatch am ON e.id = am.a_mid AND am.a_uid =
a.t_uid AND am.a_bundle = a.t_bundle
WHERE (e.eventday = 1415232000 OR e.eventday = 1415059200 OR e.eventday =
1415145600)
AND e.fk_status_id != 2
AND am.a_mid IS NULL AND a.fk_comp_id IS NULL
LIMIT 1000;

*Here is the explain plan on 9.1.1 :*
Limit (cost=100414.92..107502.31 rows=1000 width=279) (actual
time=6200.302..11650.567 rows=1000 loops=1)
-> Hash Anti Join (cost=100414.92..118856.31 rows=2602 width=279) (actual
time=6200.302..11650.567 rows=1000 loops=1)
Hash Cond: ((e.id = am.a_mid) AND ((a.t_uid)::text =
(am.a_uid)::text) AND ((a.t_bundle)::text = (am.a_bundle)::text))
-> Nested Loop (cost=88.78..7785.80 rows=2655 width=279) (actual
time=190.009..9470.460 rows=7951 loops=1)
-> Seq Scan on events e (cost=0.00..1650.14 rows=2 width=32)
(actual time=40.002..260.013 rows=8 loops=1)
Filter: ((fk_status_id <> 2) AND ((eventday = 1415232000)
OR (eventday = 1415059200) OR (eventday = 141514560
0)))
-> Bitmap Heap Scan on andalertsteam a (cost=88.78..3052.59
rows=1016 width=263) (actual time=178.759..1151.306 r
ows=994 loops=8)
Recheck Cond: ((fk_team_id = e.part_a1_id) OR
(e.part_b1_id = fk_team_id))
Filter: (fk_comp_id IS NULL)
-> BitmapOr (cost=88.78..88.78 rows=1016 width=0)
(actual time=175.008..175.008 rows=0 loops=8)
-> Bitmap Index Scan on
andalertsmatch_fk_team_id_idx (cost=0.00..44.26 rows=508 width=0) (actual
tim
e=37.502..37.502 rows=555 loops=8)
Index Cond: (fk_team_id = e.part_a1_id)
-> Bitmap Index Scan on
andalertsmatch_fk_team_id_idx (cost=0.00..44.26 rows=508 width=0) (actual
tim
e=137.507..137.507 rows=4192 loops=8)
Index Cond: (e.part_b1_id = fk_team_id)
-> Hash (cost=71287.87..71287.87 rows=1064987 width=52) (actual
time=2150.105..2150.105 rows=1064987 loops=1)
Buckets: 65536 Batches: 2 Memory Usage: 44488kB
-> Seq Scan on andalertsmatch am (cost=0.00..71287.87
rows=1064987 width=52) (actual time=0.000..1680.077 rows=10
64987 loops=1)

*Here is the explain plan on 9.1.14 :*
Limit (cost=20.64..73294.62 rows=1000 width=279) (actual
time=1419311.904..1419400.785 rows=3 loops=1)
-> Merge Anti Join (cost=20.64..193463.96 rows=2640 width=279) (actual
time=1419311.902..1419400.783 rows=3 loops=1)
Merge Cond: (e.id = am.a_mid)
Join Filter: (((am.a_uid)::text = (a.t_uid)::text) AND
((am.a_bundle)::text = (a.t_bundle)::text))
-> Nested Loop (cost=20.64..8045.28 rows=2694 width=279) (actual
time=13.230..555.366 rows=121063 loops=1)
-> Index Scan using events_pkey on events e
(cost=0.00..3369.07 rows=2 width=32) (actual time=12.494..14.003 rows=41
loops=1)
Filter: ((fk_status_id <> 2) AND ((eventday =
1415232000) OR (eventday = 1415059200) OR (eventday = 1415145600)))
-> Bitmap Heap Scan on andalertsteam a
(cost=20.64..2322.43 rows=1045 width=263) (actual time=0.467..8.612
rows=2953 loops=41)
Recheck Cond: ((fk_team_id = e.part_a1_id) OR
(e.part_b1_id = fk_team_id))
Filter: (fk_comp_id IS NULL)
-> BitmapOr (cost=20.64..20.64 rows=1045 width=0)
(actual time=0.335..0.335 rows=0 loops=41)
-> Bitmap Index Scan on
andalertsmatch_fk_team_id_idx (cost=0.00..10.19 rows=522 width=0) (actual
time=0.079..0.079 rows=401 loops=41)
Index Cond: (fk_team_id = e.part_a1_id)
-> Bitmap Index Scan on
andalertsmatch_fk_team_id_idx (cost=0.00..10.19 rows=522 width=0) (actual
time=0.253..0.253 rows=2558 loops=41)
Index Cond: (e.part_b1_id = fk_team_id)
-> Index Scan using andalertsmatch_a_mid_idx on andalertsmatch am
(cost=0.00..180798.61 rows=1173762 width=52) (actual
time=0.015..875294.427 rows=1826118122 loops=1)
Total runtime: 1419400.861 ms

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tomas Vondra 2014-11-05 16:59:18 Re: Performance problem on 2 PG versions on same query
Previous Message Bernd Helmle 2014-11-04 23:14:42 Re: [GENERAL] Re: [BUGS] COPY TO returning empty result with parallel ALTER TABLE