Adding LEFT JOIN to a query has increased execution time 10 times

From: Alexander Farber <alexander(dot)farber(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Adding LEFT JOIN to a query has increased execution time 10 times
Date: 2019-01-05 16:43:30
Message-ID: CAADeyWhxj4Pmwaa0EcjkMRPOydZTYOTyESqzPoFfe-Xj4mri0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Good evening,

On a CentOS 7.6 server (Intel Core i7-6700, 64 GB DDR4 RAM, RAID1 SSD) I
run a backend written in PL/pgSQL and Java for a mobile and desktop word
game with the following Linux packages:

postgresql10-server-10.6-1PGDG.rhel7.x86_64
pgbouncer-1.9.0-1.rhel7.x86_64
postgresql-jdbc-42.2.5-1.rhel7.noarch

Here are the only modified settings in postgresql.conf:

max_connections = 120 # (change requires restart)
work_mem = 8MB # min 64kB
maintenance_work_mem = 128MB # min 1MB

90% of the backend source code are JSON-emitting stored functions and there
is one function which is the main core of the game and is a SELECT query
over 7 tables.

It is called for every Websocket-connected client and delivers a JSON list
of active games for the player.

Until recently the query needed 1-2 seconds for completion, but after I
have added a LEFT JOIN with the following table, the query takes 7-10
seconds for completion and makes the game unpleasant to play:

# \d words_geoip;
Table "public.words_geoip"
Column | Type | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
block | inet | | not null |
lat | double precision | | |
lng | double precision | | |
Indexes:
"words_geoip_pkey" PRIMARY KEY, btree (block)

# select * from words_geoip order by random() limit 5;
block | lat | lng
-------------------+----------+-----------
217.72.221.128/25 | 48.26 | 11.434
71.183.37.0/24 | 40.9357 | -72.9809
190.174.132.0/22 | -34.6033 | -58.3817
24.72.74.128/25 | 50.5061 | -104.6752
73.155.238.0/23 | 29.5075 | -95.0895
(5 rows)

# select count(*) from words_geoip;
count
---------
3073410
(1 row)

Here is the SELECT query (I have removed the stored function and
ROW_TO_JSON around it for better readability and have commented the 3 new
lines out):

SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int AS created,
EXTRACT(EPOCH FROM g.finished)::int AS finished,
g.letters AS letters,
g.values AS values,
g.bid AS bid,
CARDINALITY(g.pile) AS pilelen,
m.tiles AS tiles,
m.score AS score,
CASE WHEN g.player1 = 5 THEN g.player1 ELSE g.player2 END AS
player1,
CASE WHEN g.player1 = 5 THEN g.player2 ELSE g.player1 END AS
player2,
CASE WHEN g.player1 = 5 THEN g.score1 ELSE g.score2 END AS score1,
CASE WHEN g.player1 = 5 THEN g.score2 ELSE g.score1 END AS score2,
CASE WHEN g.player1 = 5 THEN g.state1 ELSE g.state2 END AS state1,
CASE WHEN g.player1 = 5 THEN g.hint1 ELSE g.hint2 END AS hint1,
CASE WHEN g.player1 = 5 THEN g.chat1 ELSE g.chat2 END AS chat1,
u1.elo AS elo1,
u2.elo AS elo2,

-- i2.lat AS lat2,
-- i2.lng AS lng2,

s1.given AS given1,
s2.given AS given2,
s1.photo AS photo1,
s2.photo AS photo2,
EXTRACT(EPOCH FROM CASE WHEN g.player1 = 5 THEN g.played1 ELSE
g.played2 END)::int AS played1,
EXTRACT(EPOCH FROM CASE WHEN g.player1 = 5 THEN g.played2 ELSE
g.played1 END)::int AS played2,
ARRAY_TO_STRING(CASE WHEN g.player1 = 5 THEN g.hand1 ELSE g.hand2
END, '') AS hand1,
CASE
WHEN g.finished IS NOT NULL THEN NULL
WHEN g.player2 IS NULL THEN NULL
WHEN g.player1 = 5 AND g.played1 < g.played2
THEN EXTRACT(EPOCH FROM g.played2 + interval '24
hour' - CURRENT_TIMESTAMP)::int
WHEN g.player2 = 5 AND (g.played2 IS NULL OR g.played2 <
g.played1)
THEN EXTRACT(EPOCH FROM g.played1 + interval '24
hour' - CURRENT_TIMESTAMP)::int
ELSE NULL
END AS left1,
CASE
WHEN g.finished IS NOT NULL THEN NULL
WHEN g.player2 IS NULL THEN NULL
WHEN g.player1 = 5 AND (g.played2 IS NULL OR g.played2 <
g.played1)
THEN EXTRACT(EPOCH FROM g.played1 + interval '24
hour' - CURRENT_TIMESTAMP)::int
WHEN g.player2 = 5 AND g.played1 < g.played2
THEN EXTRACT(EPOCH FROM g.played2 + interval '24
hour' - CURRENT_TIMESTAMP)::int
ELSE NULL
END AS left2
FROM words_games g
LEFT JOIN words_moves m ON m.gid = g.gid
AND NOT EXISTS (SELECT 1
FROM words_moves m2
WHERE m2.gid = m.gid
AND m2.played > m.played)
LEFT JOIN words_users u1 ON u1.uid = 5
LEFT JOIN words_users u2 ON u2.uid = (CASE WHEN g.player1 = 5 THEN
g.player2 ELSE g.player1 END)

-- LEFT JOIN words_geoip i2 ON (CASE WHEN g.player1 = 5 THEN u2.ip ELSE
u1.ip END) << i2.block

LEFT JOIN words_social s1 ON s1.uid = 5
AND NOT EXISTS (SELECT 1
FROM words_social s
WHERE s1.uid = s.uid
AND s.stamp > s1.stamp)
LEFT JOIN words_social s2 ON s2.uid = (CASE WHEN g.player1 = 5 THEN
g.player2 ELSE g.player1 END)
AND NOT EXISTS (SELECT 1
FROM words_social s
WHERE s2.uid = s.uid
AND s.stamp > s2.stamp)
WHERE 5 IN (g.player1, g.player2)
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1
day');

Here the above query with EXPLAIN ANALYZE prepended:

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=117019.66..323100.50 rows=240 width=1414)
(actual time=390.626..1518.867 rows=9 loops=1)
Hash Cond: (CASE WHEN (g.player1 = 5) THEN g.player2 ELSE g.player1
END = s2.uid)
-> Nested Loop Left Join (cost=115853.58..321900.39 rows=240
width=1760) (actual time=384.751..1512.913 rows=9 loops=1)
-> Nested Loop Left Join (cost=115853.58..321042.49 rows=60
width=1672) (actual time=382.754..1510.864 rows=9 loops=1)
-> Nested Loop Left Join (cost=115853.29..321033.43
rows=60 width=1668) (actual time=382.749..1510.850 rows=9 loops=1)
-> Hash Right Join (cost=115853.00..320535.13
rows=60 width=1664) (actual time=382.736..1510.753 rows=9 loops=1)
Hash Cond: (m.gid = g.gid)
-> Hash Anti Join
(cost=103356.07..305406.36 rows=1002521 width=212) (actual
time=358.638..1488.564 rows=49721 loops=1)
Hash Cond: (m.gid = m2.gid)
Join Filter: (m2.played > m.played)
Rows Removed by Join Filter: 3841937
-> Seq Scan on words_moves m
(cost=0.00..77215.81 rows=1503781 width=220) (actual time=0.003..233.670
rows=1499874 loops=1)
-> Hash (cost=77215.81..77215.81
rows=1503781 width=12) (actual time=358.205..358.205 rows=1499874 loops=1)
Buckets: 262144 Batches: 16
Memory Usage: 6706kB
-> Seq Scan on words_moves m2
(cost=0.00..77215.81 rows=1503781 width=12) (actual time=0.003..204.397
rows=1499874 loops=1)
-> Hash (cost=12496.89..12496.89 rows=3
width=1456) (actual time=19.316..19.316 rows=9 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 22kB
-> Seq Scan on words_games g
(cost=0.00..12496.89 rows=3 width=1456) (actual time=15.134..19.304 rows=9
loops=1)
Filter: (((5 = player1) OR (5 =
player2)) AND ((finished IS NULL) OR (finished > (CURRENT_TIMESTAMP - '1
day'::interval))))
Rows Removed by Filter: 49730
-> Index Scan using words_users_pkey on
words_users u2 (cost=0.29..8.30 rows=1 width=8) (actual time=0.008..0.008
rows=1 loops=9)
Index Cond: (uid = CASE WHEN (g.player1 = 5)
THEN g.player2 ELSE g.player1 END)
-> Materialize (cost=0.29..8.31 rows=1 width=4)
(actual time=0.001..0.001 rows=1 loops=9)
-> Index Scan using words_users_pkey on
words_users u1 (cost=0.29..8.30 rows=1 width=4) (actual time=0.003..0.004
rows=1 loops=1)
Index Cond: (uid = 5)
-> Materialize (cost=0.00..854.91 rows=4 width=88) (actual
time=0.222..0.227 rows=1 loops=9)
-> Nested Loop Anti Join (cost=0.00..854.89 rows=4
width=88) (actual time=1.995..2.039 rows=1 loops=1)
Join Filter: ((s.stamp > s1.stamp) AND (s1.uid =
s.uid))
Rows Removed by Join Filter: 11
-> Seq Scan on words_social s1
(cost=0.00..427.20 rows=6 width=96) (actual time=0.112..0.990 rows=6
loops=1)
Filter: (uid = 5)
Rows Removed by Filter: 10652
-> Materialize (cost=0.00..427.23 rows=6
width=8) (actual time=0.015..0.174 rows=3 loops=6)
-> Seq Scan on words_social s
(cost=0.00..427.20 rows=6 width=8) (actual time=0.092..1.043 rows=6 loops=1)
Filter: (uid = 5)
Rows Removed by Filter: 10652
-> Hash (cost=1077.29..1077.29 rows=7104 width=92) (actual
time=5.855..5.855 rows=10552 loops=1)
Buckets: 16384 (originally 8192) Batches: 1 (originally 1)
Memory Usage: 1419kB
-> Hash Anti Join (cost=533.76..1077.29 rows=7104 width=92)
(actual time=1.918..4.456 rows=10552 loops=1)
Hash Cond: (s2.uid = s_1.uid)
Join Filter: (s_1.stamp > s2.stamp)
Rows Removed by Join Filter: 10709
-> Seq Scan on words_social s2 (cost=0.00..400.56
rows=10656 width=96) (actual time=0.002..0.576 rows=10658 loops=1)
-> Hash (cost=400.56..400.56 rows=10656 width=8)
(actual time=1.906..1.906 rows=10658 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 545kB
-> Seq Scan on words_social s_1
(cost=0.00..400.56 rows=10656 width=8) (actual time=0.001..1.098 rows=10658
loops=1)
Planning time: 0.409 ms
Execution time: 1518.996 ms
(48 rows)

And here the same query, but I have removed the `--` characters, so that
LEFT JOIN against the `words_geoip` table has been added:

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=117019.66..5257508.21 rows=3688092 width=1430)
(actual time=1383.205..5279.226 rows=9 loops=1)
Hash Cond: (CASE WHEN (g.player1 = 5) THEN g.player2 ELSE g.player1
END = s2.uid)
-> Nested Loop Left Join (cost=115853.58..4733610.67 rows=3688092
width=1776) (actual time=1376.856..5272.785 rows=9 loops=1)
-> Nested Loop Left Join (cost=115853.58..4686654.61
rows=922023 width=1688) (actual time=1374.768..5270.644 rows=9 loops=1)
Join Filter: (CASE WHEN (g.player1 = 5) THEN u2.ip ELSE
u1.ip END << i2.block)
Rows Removed by Join Filter: 27660682
-> Nested Loop Left Join (cost=115853.58..321042.49
rows=60 width=1686) (actual time=618.110..1531.740 rows=9 loops=1)
-> Nested Loop Left Join
(cost=115853.29..320544.19 rows=60 width=1675) (actual
time=618.105..1531.642 rows=9 loops=1)
-> Hash Right Join
(cost=115853.00..320535.13 rows=60 width=1664) (actual
time=618.093..1531.613 rows=9 loops=1)
Hash Cond: (m.gid = g.gid)
-> Hash Anti Join
(cost=103356.07..305406.36 rows=1002521 width=212) (actual
time=363.819..1510.167 rows=49721 loops=1)
Hash Cond: (m.gid = m2.gid)
Join Filter: (m2.played >
m.played)
Rows Removed by Join Filter:
4074631
-> Seq Scan on words_moves m
(cost=0.00..77215.81 rows=1503781 width=220) (actual time=0.007..235.666
rows=1499877 loops=1)
-> Hash
(cost=77215.81..77215.81 rows=1503781 width=12) (actual
time=363.537..363.537 rows=1499877 loops=1)
Buckets: 262144 Batches:
16 Memory Usage: 6706kB
-> Seq Scan on
words_moves m2 (cost=0.00..77215.81 rows=1503781 width=12) (actual
time=0.002..204.827 rows=1499877 loops=1)
-> Hash (cost=12496.89..12496.89
rows=3 width=1456) (actual time=18.521..18.521 rows=9 loops=1)
Buckets: 1024 Batches: 1
Memory Usage: 22kB
-> Seq Scan on words_games g
(cost=0.00..12496.89 rows=3 width=1456) (actual time=12.417..18.511 rows=9
loops=1)
Filter: (((5 = player1) OR
(5 = player2)) AND ((finished IS NULL) OR (finished > (CURRENT_TIMESTAMP -
'1 day'::interval))))
Rows Removed by Filter:
49730
-> Materialize (cost=0.29..8.31 rows=1
width=11) (actual time=0.001..0.002 rows=1 loops=9)
-> Index Scan using words_users_pkey
on words_users u1 (cost=0.29..8.30 rows=1 width=11) (actual
time=0.008..0.010 rows=1 loops=1)
Index Cond: (uid = 5)
-> Index Scan using words_users_pkey on
words_users u2 (cost=0.29..8.30 rows=1 width=15) (actual time=0.008..0.008
rows=1 loops=9)
Index Cond: (uid = CASE WHEN (g.player1 = 5)
THEN g.player2 ELSE g.player1 END)
-> Materialize (cost=0.00..83684.15 rows=3073410
width=23) (actual time=0.004..228.492 rows=3073410 loops=9)
-> Seq Scan on words_geoip i2
(cost=0.00..50308.10 rows=3073410 width=23) (actual time=0.008..248.647
rows=3073410 loops=1)
-> Materialize (cost=0.00..854.91 rows=4 width=88) (actual
time=0.232..0.237 rows=1 loops=9)
-> Nested Loop Anti Join (cost=0.00..854.89 rows=4
width=88) (actual time=2.085..2.129 rows=1 loops=1)
Join Filter: ((s.stamp > s1.stamp) AND (s1.uid =
s.uid))
Rows Removed by Join Filter: 11
-> Seq Scan on words_social s1
(cost=0.00..427.20 rows=6 width=96) (actual time=0.114..1.029 rows=6
loops=1)
Filter: (uid = 5)
Rows Removed by Filter: 10652
-> Materialize (cost=0.00..427.23 rows=6
width=8) (actual time=0.015..0.183 rows=3 loops=6)
-> Seq Scan on words_social s
(cost=0.00..427.20 rows=6 width=8) (actual time=0.087..1.094 rows=6 loops=1)
Filter: (uid = 5)
Rows Removed by Filter: 10652
-> Hash (cost=1077.29..1077.29 rows=7104 width=92) (actual
time=6.326..6.326 rows=10552 loops=1)
Buckets: 16384 (originally 8192) Batches: 1 (originally 1)
Memory Usage: 1419kB
-> Hash Anti Join (cost=533.76..1077.29 rows=7104 width=92)
(actual time=1.966..4.600 rows=10552 loops=1)
Hash Cond: (s2.uid = s_1.uid)
Join Filter: (s_1.stamp > s2.stamp)
Rows Removed by Join Filter: 10709
-> Seq Scan on words_social s2 (cost=0.00..400.56
rows=10656 width=96) (actual time=0.002..0.622 rows=10658 loops=1)
-> Hash (cost=400.56..400.56 rows=10656 width=8)
(actual time=1.952..1.952 rows=10658 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 545kB
-> Seq Scan on words_social s_1
(cost=0.00..400.56 rows=10656 width=8) (actual time=0.001..1.130 rows=10658
loops=1)
Planning time: 0.511 ms
Execution time: 5292.006 ms
(53 rows)

I am probably missing something minor, but I am not experienced with tuning
SQL queries plus the whole query is quite extensive. Please help

I have also asked my question at
https://dba.stackexchange.com/questions/226412/adding-left-join-to-a-query-has-increased-execution-time-10-times

Greetings from Germany
Alex

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2019-01-05 17:35:40 Re: Dropping and creating a trigger
Previous Message Ron 2019-01-05 10:16:06 Re: Dropping and creating a trigger