Skip site navigation (1) Skip section navigation (2)

The plan changes when the limit gets above ~850,000

From: "Wm(dot)A(dot)Stafford" <stafford(at)marine(dot)rutgers(dot)edu>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: The plan changes when the limit gets above ~850,000
Date: 2008-10-07 15:48:05
Message-ID: 48EB84B5.7010305@marine.rutgers.edu (view raw or flat)
Thread:
Lists: pgsql-jdbc
First of all, thanks to Albe Laurenz for pointing out how to analyze the 
situation where there was a change in behavior when the limit was above 
about 850,000.

I looked at the plan for both versions of the query and, as Albe 
suggested, the plans were different.  I don't know how to analyze the 
plans but I'm guessing that when the number of records returned gets 
larger setup time is an increasingly smaller part of the entire process 
so more time can be spent on setup.  This would result in the apparent 
inactivity of the application running the query when limit went from 
850,000 to 1,000,000

I think I would like the query to work the same regardless of the number 
of records returned but I'm not sure: 1. if that can be done, 2. it 
would be substantially slower or 3. if there is any choice in the matter 
at all.

The two plans are below.  Any help or advice would be appreciated.

-=bill

the query:
SELECT A.cache_id, A.validname, B.resource_full_name, B.resource_id , 
C.worms_id
FROM cache A, resources B, taxa C
WHERE A.resource_id=B.resource_id AND A.taxon_id=C.taxon_id
OFFSET 0 LIMIT 800000

the plan when limit = 800000
"Limit  (cost=0.00..5868507.75 rows=800000 width=614)"
"  ->  Nested Loop  (cost=0.00..104095524.49 rows=14190391 width=614)"
"        ->  Merge Join  (cost=0.00..2615512.71 rows=14190391 width=94)"
"              Merge Cond: (c.taxon_id = a.taxon_id)"
"              ->  Index Scan using ix_taxon_id on taxa c  
(cost=0.00..2020343.76 rows=519932 width=12)"
"              ->  Index Scan using ix_taxon_id2 on "cache" a  
(cost=0.00..819727.67 rows=14190391 width=94)"
"        ->  Index Scan using obis_resources_pkey on obis_resources b  
(cost=0.00..7.12 rows=1 width=524)"
"              Index Cond: (a.resource_id = b.resource_id)"

the plan when limit = 1000000
"Limit  (cost=5941567.20..5974067.35 rows=1000000 width=614)"
"  ->  Merge Join  (cost=5941567.20..6402757.08 rows=14190391 width=614)"
"        Merge Cond: (b.resource_id = a.resource_id)"
"        ->  Sort  (cost=105.54..106.63 rows=437 width=524)"
"              Sort Key: b.resource_id"
"              ->  Seq Scan on obis_resources b  (cost=0.00..86.37 
rows=437 width=524)"
"        ->  Sort  (cost=5941461.66..5976937.64 rows=14190391 width=94)"
"              Sort Key: a.resource_id"
"              ->  Merge Join  (cost=97698.57..1151605.96 rows=14190391 
width=94)"
"                    Merge Cond: (a.taxon_id = c.taxon_id)"
"                    ->  Index Scan using ix_taxon_id2 on "cache" a  
(cost=0.00..819727.67 rows=14190391 width=94)"
"                    ->  Sort  (cost=97698.57..98998.40 rows=519932 
width=12)"
"                          Sort Key: c.taxon_id"
"                          ->  Seq Scan on taxa c  (cost=0.00..27007.32 
rows=519932 width=12)"


Responses

pgsql-jdbc by date

Next:From: Tom LaneDate: 2008-10-07 16:13:19
Subject: Re: The plan changes when the limit gets above ~850,000
Previous:From: Tilman RassyDate: 2008-10-07 14:57:10
Subject: Re: Aggregate function: Different results with jdbc and psql

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group