Slow query with backwards index scan

From: Tilmann Singer <tils-pgsql(at)tils(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Slow query with backwards index scan
Date: 2007-07-27 17:27:03
Message-ID: 20070727172703.GG25252@tils.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Dear list,

I am having problems selecting the 10 most recent rows from a large
table (4.5M rows), sorted by a date column of that table. The large
table has a column user_id which either should match a given user_id,
or should match the column contact_id in a correlated table where the
user_id of that correlated table must match the given user_id.

The user_id values in the large table are distributed in such a way
that in the majority of cases for a given user_id 10 matching rows can
be found very early when looking at the table sorted by the date -
propably within the first 1%. Sometimes the given user_id however
would match any rows only very far towards the end of the sorted large
table, or not at all.

The query works fine for the common cases when matching rows are found
early in the sorted large table, like this:

testdb=# EXPLAIN ANALYZE
SELECT * FROM large_table lt
LEFT JOIN relationships r ON lt.user_id=r.contact_id
WHERE r.user_id = 55555 OR lt.user_id = 55555
ORDER BY lt.created_at DESC LIMIT 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..33809.31 rows=10 width=646) (actual time=0.088..69.448 rows=10 loops=1)
-> Nested Loop Left Join (cost=0.00..156983372.66 rows=46432 width=646) (actual time=0.082..69.393 rows=10 loops=1)
Filter: ((r.user_id = 55555) OR (lt.user_id = 55555))
-> Index Scan Backward using large_created_at_index on large_table lt (cost=0.00..914814.94 rows=4382838 width=622) (actual time=0.028..0.067 rows=13 loops=1)
-> Index Scan using relationships_contact_id_index on relationships r (cost=0.00..35.33 rows=16 width=24) (actual time=0.017..2.722 rows=775 loops=13)
Index Cond: (lt.user_id = r.contact_id)
Total runtime: 69.640 ms

but for the following user_id there are 3M rows in the large table
which are more recent then the 10th matching one. The query then does
not perform so well:

testdb=# EXPLAIN ANALYZE
SELECT * FROM large_table lt
LEFT JOIN relationships r ON lt.user_id=r.contact_id
WHERE r.user_id = 12345 OR lt.user_id = 12345
ORDER BY lt.created_at DESC LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..33809.31 rows=10 width=646) (actual time=203454.353..425978.718 rows=10 loops=1)
-> Nested Loop Left Join (cost=0.00..156983372.66 rows=46432 width=646) (actual time=203454.347..425978.662 rows=10 loops=1)
Filter: ((r.user_id = 12345) OR (lt.user_id = 12345))
-> Index Scan Backward using large_created_at_index on large_table lt (cost=0.00..914814.94 rows=4382838 width=622) (actual time=0.031..78386.769 rows=3017547 loops=1)
-> Index Scan using relationships_contact_id_index on relationships r (cost=0.00..35.33 rows=16 width=24) (actual time=0.006..0.060 rows=18 loops=3017547)
Index Cond: (lt.user_id = r.contact_id)
Total runtime: 425978.903 ms

When split it up into the two following queries it performs much
better for that user_id. Since the results of the two could be
combined into the desired result, I'm assuming it could also be done
efficiently within one query, if only a better plan would be used.

testdb=# EXPLAIN ANALYZE
SELECT * FROM large_table lt
WHERE lt.user_id = 12345
ORDER BY lt.created_at DESC LIMIT 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..33.57 rows=10 width=622) (actual time=64.030..71.720 rows=10 loops=1)
-> Index Scan Backward using large_user_id_created_at_index on large_table lt (cost=0.00..7243.59 rows=2158 width=622) (actual time=64.023..71.662 rows=10 loops=1)
Index Cond: (user_id = 12345)
Total runtime: 71.826 ms

testdb=# EXPLAIN ANALYZE
SELECT * FROM large_table lt
WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
ORDER BY created_at DESC LIMIT 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=6902.52..6902.54 rows=10 width=622) (actual time=0.232..0.262 rows=4 loops=1)
-> Sort (cost=6902.52..6905.57 rows=1220 width=622) (actual time=0.225..0.237 rows=4 loops=1)
Sort Key: lt.created_at
-> Nested Loop (cost=42.78..6839.98 rows=1220 width=622) (actual time=0.090..0.185 rows=4 loops=1)
-> HashAggregate (cost=42.78..42.79 rows=1 width=4) (actual time=0.059..0.062 rows=1 loops=1)
-> Bitmap Heap Scan on relationships (cost=4.34..42.75 rows=11 width=4) (actual time=0.038..0.041 rows=1 loops=1)
Recheck Cond: (user_id = 12345)
-> Bitmap Index Scan on relationships_user_id_index (cost=0.00..4.34 rows=11 width=0) (actual time=0.027..0.027 rows=1 loops=1)
Index Cond: (user_id = 12345)
-> Index Scan using large_user_id_started_at_index on large_table lt (cost=0.00..6768.48 rows=2297 width=622) (actual time=0.020..0.087 rows=4 loops=1)
Index Cond: (lt.user_id = relationships.contact_id)
Total runtime: 0.439 ms

I'm not very experienced reading query plans and don't know how to go
about this from here - is it theoretically possible to have a query
that performs well with the given data in both cases or is there a
conceptual problem?

The database was freshly imported and ANALYZEd before running the
above tests.

I also tried the following for every involved column: increase
statistics target, analyze the table, explain analyze the slow query,
but the plan never changed.

The relevant schema and indices portions are:

testdb=# \d large_table
Table "public.large_table"
Column | Type | Modifiers
-------------+-----------------------------+--------------------------------------------------------
id | integer | not null default nextval('large_id_seq'::regclass)
user_id | integer | not null
created_at | timestamp without time zone |
Indexes:
"large_pkey" PRIMARY KEY, btree (id)
"large_created_at_index" btree (created_at)
"large_user_id_created_at_index" btree (user_id, created_at)

testdb=# \d relationships
Table "public.relationships"
Column | Type | Modifiers
------------+-----------------------------+------------------------------------------------------------
id | integer | not null default nextval('relationships_id_seq'::regclass)
user_id | integer |
contact_id | integer |
Indexes:
"relationships_pkey" PRIMARY KEY, btree (id)
"relationships_contact_id_index" btree (contact_id)
"relationships_user_id_index" btree (user_id, contact_id)

testdb=# select tablename, attname, null_frac, avg_width, n_distinct, correlation from pg_stats where tablename in ('large_table', 'relationships');
tablename | attname | null_frac | avg_width | n_distinct | correlation
-----------------+-------------+------------+-----------+------------+-------------
relationships | id | 0 | 4 | -1 | 0.252015
relationships | user_id | 0 | 4 | 3872 | 0.12848
relationships | contact_id | 0 | 4 | 3592 | 0.6099
large_table | id | 0 | 4 | -1 | 0.980048
large_table | user_id | 0 | 4 | 1908 | 0.527973
large_table | created_at | 0 | 8 | 87636 | 0.973318

select version();
version
-----------------------------------------------------------------------------------------------
PostgreSQL 8.2.4 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4)
(1 row)

grateful for any advice, Til

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2007-07-27 18:27:00 Re: How to use a trigger to write rows to a remote server
Previous Message Merlin Moncure 2007-07-27 13:44:31 Re: disk filling up