Re: Using "LIMIT" is much faster even though, searching

From: "Hyun-Sung, Jang" <siche(at)siche(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Using "LIMIT" is much faster even though, searching
Date: 2004-12-01 06:03:31
Message-ID: 41AD5EB3.80201@siche.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

do you need all of verbose information??
VACUUM FULL ANALYZE VERBOSE give me a lot of infomation,
so i just cut zipcode parts.

==start===============================================================================
INFO: vacuuming "public.zipcode"
INFO: "zipcode": found 0 removable, 47705 nonremovable row versions in
572 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 76 to 136 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 27944 bytes.
0 pages are or will become empty, including 0 at the end of the table.
91 pages containing 8924 free bytes are potential move destinations.
CPU 0.03s/0.00u sec elapsed 0.03 sec.
INFO: index "zipcode_pkey" now contains 47705 row versions in 147 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.00 sec.
INFO: "zipcode": moved 0 row versions, truncated 572 to 572 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.zipcode"
INFO: "zipcode": scanned 572 of 572 pages, containing 47705 live rows
and 0 dead rows; 3000 rows in sample, 47705 estimated total rows
INFO: free space map: 108 relations, 128 pages stored; 1760 total pages
needed
DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 182 kB
shared memory.
VACUUM
pgsql=#
==end===============================================================================

USING INDEX SCAN

==start===============================================================================
pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='1';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using zipcode_pkey on zipcode (cost=0.00..3.02 rows=1
width=55) (actual time=0.054..0.058 rows=1 loops=1)
Index Cond: (seq = 1)
Total runtime: 0.152 ms
(3 rows)

pgsql=#

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='1' LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..3.02 rows=1 width=55) (actual time=0.059..0.060
rows=1 loops=1)
-> Index Scan using zipcode_pkey on zipcode (cost=0.00..3.02 rows=1
width=55) (actual time=0.054..0.054 rows=1 loops=1)
Index Cond: (seq = 1)
Total runtime: 0.158 ms
(4 rows)

pgsql=#

WHEN SELECT LAST ROW -----

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='47705';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using zipcode_pkey on zipcode (cost=0.00..3.02 rows=1
width=55) (actual time=0.054..0.059 rows=1 loops=1)
Index Cond: (seq = 47705)
Total runtime: 0.150 ms
(3 rows)

pgsql=#

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='47705' LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..3.02 rows=1 width=55) (actual time=0.057..0.057
rows=1 loops=1)
-> Index Scan using zipcode_pkey on zipcode (cost=0.00..3.02 rows=1
width=55) (actual time=0.052..0.052 rows=1 loops=1)
Index Cond: (seq = 47705)
Total runtime: 0.156 ms
(4 rows)

pgsql=#
==end===============================================================================

USING SEQUENCE SCAN

==start===============================================================================
pgsql=# set enable_indexscan to off;
SET
pgsql=#

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='1';
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on zipcode (cost=0.00..1168.31 rows=1 width=55) (actual
time=0.032..109.934 rows=1 loops=1)
Filter: (seq = 1)
Total runtime: 110.021 ms
(3 rows)

pgsql=#

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='1' LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1168.31 rows=1 width=55) (actual time=0.035..0.035
rows=1 loops=1)
-> Seq Scan on zipcode (cost=0.00..1168.31 rows=1 width=55) (actual
time=0.030..0.030 rows=1 loops=1)
Filter: (seq = 1)
Total runtime: 0.113 ms
(4 rows)

pgsql=#

WHEN SELECT LAST ROW -----

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='47705';
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on zipcode (cost=0.00..1168.31 rows=1 width=55) (actual
time=4.048..110.232 rows=1 loops=1)
Filter: (seq = 47705)
Total runtime: 110.322 ms
(3 rows)

pgsql=#

pgsql=# EXPLAIN ANALYZE select * from zipcode where seq='47705' LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1168.31 rows=1 width=55) (actual time=4.038..4.038
rows=1 loops=1)
-> Seq Scan on zipcode (cost=0.00..1168.31 rows=1 width=55) (actual
time=4.033..4.033 rows=1 loops=1)
Filter: (seq = 47705)
Total runtime: 4.125 ms
(4 rows)

pgsql=#

==end===============================================================================

I just choose zipcode table for this test.
not only zipcode table but other table also give me same result.

SELECT * FROM table_name WHERE PK = 'xxx'

was always slower than

SELECT * FROM table_name WHERE PK = 'xxx' LIMIT 1

when sequence scan .

i think pgsql tring to find more than 1 row when query executed even if
searching condition is primary key.

ah, why i'm using sequence as PK instead of zip code is
in korea, the small towns doesn't have it's own zipcode
so they share other big city's.
that's why zip code can't be a primary key.
actually, i'm not using sequence to find zipcode.
i made it temporary for this test.

i think there is nobody want to using sequence number to find zipcode,
instead of city name. :-)

Josh Berkus 쓴 글:

>Hyun-Sang,
>
>
>
>>before test, I already executed VACUUM FULL.
>>this result show up after vacuum full.
>>
>>
>
>Really? Your results really look like a bloated table. Can you run VACUUM
>FULL ANALYZE VERBOSE on the table and post the output?
>
>
>
>>When I using index scan, the result was almost same, that means, there
>>was no time difference, so i'll not mention about index scan.
>>
>>
>
>Can we see an index scan plan anyway? EXPLAIN ANALYZE?
>
>Oh, and if this is a zip codes table, why are you using a sequence as the
>primary key instead of just using the zip code?
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andrew McMillan 2004-12-01 08:23:30 Re: Using "LIMIT" is much faster even though, searching
Previous Message Josh Berkus 2004-12-01 05:03:51 Re: Using "LIMIT" is much faster even though, searching