Quick Links

Re: Query performance discontinuity

From:	Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To:	Mike Nielsen <miken(at)bigpond(dot)net(dot)au>
Cc:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Query performance discontinuity
Date:	2002-11-11 19:37:18
Message-ID:	20021111112148.S54055-100000@megazone23.bigpanda.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 12 Nov 2002, Mike Nielsen wrote:

> Just out of curiosity, anybody with any ideas on what happens to this
> query when the limit is 59626? It's as though 59626 = infinity?

> EXPLAIN
> pganalysis=> explain analyze select * from ps2 where tstart<> '2000-1-1
> 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by
> tstart,time_stamp limit 59625;
> NOTICE: QUERY PLAN:
>
> Limit (cost=0.00..160331.06 rows=59625 width=179) (actual
> time=0.45..2212.19 rows=59625 loops=1)
> -> Index Scan using ps2_idx on ps2 (cost=0.00..881812.85 rows=327935
> width=179) (actual time=0.45..2140.87 rows=59626 loops=1)
> Total runtime: 2254.50 msec
>
> EXPLAIN
> pganalysis=> explain analyze select * from ps2 where tstart<> '2000-1-1
> 00:00:00' and time_stamp > '2000-1-1 00:00:00' order by
> tstart,time_stamp limit 59626;
> NOTICE: QUERY PLAN:
>
> Limit (cost=160332.32..160332.32 rows=59626 width=179) (actual
> time=37359.41..37535.85 rows=59626 loops=1)
> -> Sort (cost=160332.32..160332.32 rows=327935 width=179) (actual
> time=37359.40..37471.07 rows=59627 loops=1)
> -> Seq Scan on ps2 (cost=0.00..13783.52 rows=327935 width=179)
> (actual time=0.26..12433.00 rows=327960 loops=1)
> Total runtime: 38477.39 msec

This is apparently the breakpoint at which the sequence scan/sort/limit
max cost seems to become lower than indexscan/limit given the small
difference in estimated costs. What do you get with limit 59626 and
enable_seqscan=off? My guess is that it's just above the 160332.32
estimated here.
I believe that the query is using the index to avoid a sort, but
possibly/probably not to do the condition. I'd wonder if analyzing with
more buckets might get it a better idea, but I really don't know.
Another option is to see what making an index on (time_stamp, tstart)
gives you, but if most of the table meets the time_stamp condition,
that wouldn't help any.

In response to

Query performance discontinuity at 2002-11-11 17:44:44 from Mike Nielsen

Responses

Re: Query performance discontinuity at 2002-11-12 01:39:01 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Tom Lane	2002-11-12 01:39:01	Re: Query performance discontinuity
Previous Message	Josh Berkus	2002-11-11 19:14:49	Re: Query performance discontinuity