9.6 Beta 2 Performance Regression on Recursive CTE

From: "Nicholson, Brad (Toronto, ON, CA)" <bnicholson(at)hpe(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: 9.6 Beta 2 Performance Regression on Recursive CTE
Date: 2016-07-04 20:03:40
Message-ID: CS1PR84MB0214C7D911E405CB83EE188AD7380@CS1PR84MB0214.NAMPRD84.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I'm seeing a performance regression on 9.6 Beta 2 compared to 9.5.3. The query is question is a recursive query on graph data stored as an adjacency list. While this is example is fairly contrived, it mimics the behavior I am seeing on real data with more realistic queries. The example below uses the same data set, same DB configuration, same query plan and same host. But the 9.6 beta 2 query runs in 26 seconds and 9.5.3 runs in 13.4 seconds. This pattern holds up on repeated runs.

This seems to be related to the citext data type. If I re-run the test below with varchar as the data type, performance is similar between 9.5 and 9.6.

Both are running on CentOS 6.6 using the PG community RPMs.

Version details:

version
----------------------------------------------------------------------------------------------------------
PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit


version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 9.6beta2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

Relevant config options (same on both)

cpu_index_tuple_cost=0.005
cpu_operator_cost=0.0025
cpu_tuple_cost=0.05
effective_cache_size=4GB
random_page_cost=4
seq_page_cost=1
shared_buffers=2GB
temp_buffers=8MB
work_mem=4MB

Very basic table structure:

\d adj_edge_citext
Table "public.adj_edge_citext"
Column | Type | Modifiers
--------+--------+-----------
parent | citext |
child | citext |

Import the Graph of Thrones data set from: https://www.macalester.edu/~abeverid/data/stormofswords.csv (ignore the weight column).

Run VACUUM ANALYZE on the table, and then run the following:

EXPLAIN (analyze, buffers, verbose) WITH RECURSIVE d AS (
SELECT
a.parent,
a.child,
1 AS LEVEL,
ARRAY[a.parent] AS path,
FALSE AS CYCLE
FROM
adj_edge_citext a
WHERE a.parent in (SELECT distinct(parent) FROM adj_edge_citext)
UNION ALL
SELECT
a1.parent,
a1.child,
d.LEVEL+1,
path || a1.parent,
a1.parent=ANY(path)
FROM
adj_edge_citext a1
JOIN d ON d.child=a1.parent
WHERE
NOT CYCLE
)
SELECT
parent,
child,
level,
path
FROM d;


9.5.3:
https://explain.depesz.com/s/HOO7
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on d (cost=9766.98..12900.18 rows=31332 width=100) (actual time=0.810..13023.860 rows=1427997 loops=1)
Output: d.parent, d.child, d.level, d.path
Buffers: shared hit=6, temp read=31714 written=64240
CTE d
-> Recursive Union (cost=31.31..9766.98 rows=31332 width=48) (actual time=0.805..9613.617 rows=1427997 loops=1)
Buffers: shared hit=6, temp read=31714 written=31701
-> Hash Semi Join (cost=31.31..69.83 rows=352 width=12) (actual time=0.803..1.879 rows=352 loops=1)
Output: a.parent, a.child, 1, ARRAY[a.parent], false
Hash Cond: (a.parent = adj_edge_citext.parent)
Buffers: shared hit=4
-> Seq Scan on public.adj_edge_citext a (cost=0.00..19.60 rows=352 width=12) (actual time=0.021..0.177 rows=352 loops=1)
Output: a.parent, a.child
Buffers: shared hit=2
-> Hash (cost=27.58..27.58 rows=71 width=6) (actual time=0.760..0.760 rows=71 loops=1)
Output: adj_edge_citext.parent
Buckets: 1024 Batches: 1 Memory Usage: 11kB
Buffers: shared hit=2
-> HashAggregate (cost=20.48..24.03 rows=71 width=6) (actual time=0.638..0.681 rows=71 loops=1)
Output: adj_edge_citext.parent
Group Key: adj_edge_citext.parent
Buffers: shared hit=2
-> Seq Scan on public.adj_edge_citext (cost=0.00..19.60 rows=352 width=6) (actual time=0.010..0.168 rows=352 loops=1)
Output: adj_edge_citext.parent
Buffers: shared hit=2
-> Hash Join (cost=38.08..656.40 rows=3098 width=48) (actual time=0.054..303.806 rows=62072 loops=23)
Output: a1.parent, a1.child, (d_1.level + 1), (d_1.path || a1.parent), (a1.parent = ANY (d_1.path))
Hash Cond: (d_1.child = a1.parent)
Buffers: shared hit=2, temp read=31714 written=13
-> WorkTable Scan on d d_1 (cost=0.00..352.00 rows=1760 width=68) (actual time=0.013..35.953 rows=62087 loops=23)
Output: d_1.parent, d_1.child, d_1.level, d_1.path, d_1.cycle
Filter: (NOT d_1.cycle)
Buffers: temp read=31714 written=13
-> Hash (cost=19.60..19.60 rows=352 width=12) (actual time=0.496..0.496 rows=352 loops=1)
Output: a1.parent, a1.child
Buckets: 1024 Batches: 1 Memory Usage: 24kB
Buffers: shared hit=2
-> Seq Scan on public.adj_edge_citext a1 (cost=0.00..19.60 rows=352 width=12) (actual time=0.009..0.204 rows=352 loops=1)
Output: a1.parent, a1.child
Buffers: shared hit=2
Planning time: 0.543 ms
Execution time: 13415.280 ms

9.6 Beta 2:
https://explain.depesz.com/s/R7oR
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on d (cost=9766.98..12900.18 rows=31332 width=100) (actual time=1.737..25688.068 rows=1427997 loops=1)
Output: d.parent, d.child, d.level, d.path
Buffers: shared hit=6, temp read=31714 written=64240
CTE d
-> Recursive Union (cost=31.31..9766.98 rows=31332 width=49) (actual time=1.731..22178.446 rows=1427997 loops=1)
Buffers: shared hit=6, temp read=31714 written=31701
-> Hash Semi Join (cost=31.31..69.83 rows=352 width=49) (actual time=1.728..3.862 rows=352 loops=1)
Output: a.parent, a.child, 1, ARRAY[a.parent], false
Hash Cond: (a.parent = adj_edge_citext.parent)
Buffers: shared hit=4
-> Seq Scan on index.adj_edge_citext a (cost=0.00..19.60 rows=352 width=12) (actual time=0.023..0.183 rows=352 loops=1)
Output: a.parent, a.child
Buffers: shared hit=2
-> Hash (cost=27.58..27.58 rows=71 width=6) (actual time=1.678..1.678 rows=71 loops=1)
Output: adj_edge_citext.parent
Buckets: 1024 Batches: 1 Memory Usage: 11kB
Buffers: shared hit=2
-> HashAggregate (cost=20.48..24.03 rows=71 width=6) (actual time=1.485..1.527 rows=71 loops=1)
Output: adj_edge_citext.parent
Group Key: adj_edge_citext.parent
Buffers: shared hit=2
-> Seq Scan on index.adj_edge_citext (cost=0.00..19.60 rows=352 width=6) (actual time=0.010..0.177 rows=352 loops=1)
Output: adj_edge_citext.parent, adj_edge_citext.child
Buffers: shared hit=2
-> Hash Join (cost=38.08..656.40 rows=3098 width=49) (actual time=0.095..845.508 rows=62072 loops=23)
Output: a1.parent, a1.child, (d_1.level + 1), (d_1.path || a1.parent), (a1.parent = ANY (d_1.path))
Hash Cond: (d_1.child = a1.parent)
Buffers: shared hit=2, temp read=31714 written=13
-> WorkTable Scan on d d_1 (cost=0.00..352.00 rows=1760 width=68) (actual time=0.021..41.653 rows=62087 loops=23)
Output: d_1.parent, d_1.child, d_1.level, d_1.path, d_1.cycle
Filter: (NOT d_1.cycle)
Buffers: temp read=31714 written=13
-> Hash (cost=19.60..19.60 rows=352 width=12) (actual time=0.757..0.757 rows=352 loops=1)
Output: a1.parent, a1.child
Buckets: 1024 Batches: 1 Memory Usage: 24kB
Buffers: shared hit=2
-> Seq Scan on index.adj_edge_citext a1 (cost=0.00..19.60 rows=352 width=12) (actual time=0.012..0.166 rows=352 loops=1)
Output: a1.parent, a1.child
Buffers: shared hit=2
Planning time: 0.785 ms
Execution time: 26062.927 ms

As an aside, the cost mis-estimates are very common with these types of queries on graph data, and frequently result in the planner making bad choices which result in queries taking seconds instead of milliseconds. I am aware of the whole CTE optimization fence issue, but it is a pretty big problem for graph data in Postgres.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Martín Marqués 2016-07-04 23:37:34 Re: Unable to recovery due missing wal_file
Previous Message Craig Boyd 2016-07-04 16:15:58 Re: psql connection option: statement_timeout