Should changing offset in LIMIT change query plan (at all/so early)?

From: goran <goran(dot)gugic(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Should changing offset in LIMIT change query plan (at all/so early)?
Date: 2010-11-19 12:33:43
Message-ID: 9a122873-72e9-416a-9f17-3e844ad1a11b@l8g2000yqh.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Trying to understand why query planer changes the plan from effective
one to ineffective one when I change the offset in the LIMIT. Also,
thankfully accepting RTFM pointers to the actual FMs.

Setup is: 3 tables with 0.5M to 1.5M records
While tuning indexes for the following query

SELECT c.id, c.name, c.owner
FROM catalog c, catalog_securitygroup cs, catalog_university cu
WHERE c.root < 50
AND cs.catalog = c.id
AND cu.catalog = c.id
AND cs.securitygroup < 200
AND cu.university < 200
ORDER BY c.name
LIMIT 50 OFFSET 100

I managed to bring it to ~3ms with the following plan
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Limit (cost=15141.07..22711.60 rows=50 width=59)
-> Nested Loop (cost=0.00..30052749.38 rows=198485 width=59)
-> Nested Loop (cost=0.00..705519.23 rows=147500 width=63)
-> Index Scan using test2 on catalog c
(cost=0.00..241088.93 rows=147500 width=59)
Index Cond: (root < 50)
-> Index Scan using catalog_university_pkey on
catalog_university cu (cost=0.00..3.14 rows=1 width=4)
Index Cond: ((cu.catalog = c.id) AND
(cu.university < 200))
-> Index Scan using catalog_securitygroup_pkey on
catalog_securitygroup cs (cost=0.00..196.48 rows=199 width=4)
Index Cond: ((cs.catalog = c.id) AND (cs.securitygroup
< 200))

But when I change the OFFSET in the limit to 500 it goes to ~500ms
with following plan
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=61421.34..61421.46 rows=50 width=59)
-> Sort (cost=61420.09..61916.30 rows=198485 width=59)
Sort Key: c.name
-> Merge Join (cost=45637.87..51393.33 rows=198485
width=59)
Merge Cond: (c.id = cs.catalog)
-> Merge Join (cost=48.95..440699.65 rows=147500
width=63)
Merge Cond: (c.id = cu.catalog)
-> Index Scan using catalog_pkey on catalog c
(cost=0.00..78947.35 rows=147500 width=59)
Filter: (root < 50)
-> Index Scan using catalog_university_pkey on
catalog_university cu (cost=0.00..358658.68 rows=499950 width=4)
Index Cond: (cu.university < 200)
-> Materialize (cost=45527.12..48008.19 rows=198485
width=4)
-> Sort (cost=45527.12..46023.34 rows=198485
width=4)
Sort Key: cs.catalog
-> Seq Scan on catalog_securitygroup cs
(cost=0.00..25345.76 rows=198485 width=4)
Filter: (securitygroup < 200)

Thanks for your time

Browse pgsql-performance by date

  From Date Subject
Next Message Jignesh Shah 2010-11-19 14:52:47 Re: Defaulting wal_sync_method to fdatasync on Linux for 9.1?
Previous Message Louis-David Mitterrand 2010-11-19 09:50:21 Re: best db schema for time series data?