Skip site navigation (1) Skip section navigation (2)

8.1 count(*) distinct: IndexScan/SeqScan

From: Pailloncy Jean-Gerard <jg(at)rilk(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: 8.1 count(*) distinct: IndexScan/SeqScan
Date: 2005-11-23 22:14:47
Message-ID: 334D9941-B5B3-4D18-8312-F85D0FB054ED@rilk.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

PostgreSQL 8.1 fresh install on a freshly installed OpenBSD 3.8 box.

postgres=# CREATE DATABASE test;
CREATE DATABASE
postgres=# create table test (id serial, val integer);
NOTICE:  CREATE TABLE will create implicit sequence "test_id_seq" for  
serial column "test.id"
CREATE TABLE
postgres=# create unique index testid on test (id);
CREATE INDEX
postgres=# create index testval on test (val);
CREATE INDEX
postgres=# insert into test (val) values (round(random() 
*1024*1024*1024));
INSERT 0 1

[...] insert many random values

postgres=# vaccum full verbose analyze;
postgres=# select count(1) from test;
   count
---------
2097152
(1 row)

postgres=# explain select count(*) from (select distinct on (val) *  
from test) as foo;
                                         QUERY PLAN
------------------------------------------------------------------------ 
------------------
Aggregate  (cost=66328.72..66328.73 rows=1 width=0)
    ->  Unique  (cost=0.00..40114.32 rows=2097152 width=8)
          ->  Index Scan using testval on test  (cost=0.00..34871.44  
rows=2097152 width=8)
(3 rows)

postgres=# set enable_indexscan=off;
postgres=# explain analyze select count(*) from (select distinct on  
(val) * from test) as foo;
                                                              QUERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------
Aggregate  (cost=280438.64..280438.65 rows=1 width=0) (actual  
time=39604.107..39604.108 rows=1 loops=1)
    ->  Unique  (cost=243738.48..254224.24 rows=2097152 width=8)  
(actual time=30281.004..37746.488 rows=2095104 loops=1)
          ->  Sort  (cost=243738.48..248981.36 rows=2097152 width=8)  
(actual time=30280.999..33744.197 rows=2097152 loops=1)
                Sort Key: test.val
                ->  Seq Scan on test  (cost=0.00..23537.52  
rows=2097152 width=8) (actual time=11.550..3262.433 rows=2097152  
loops=1)
Total runtime: 39624.094 ms
(6 rows)

postgres=# set enable_indexscan=on;
postgres=# explain analyze select count(*) from (select distinct on  
(val) * from test where val<10000000) as foo;
                                                                    
QUERY PLAN
------------------------------------------------------------------------ 
-----------------------------------------------------------------------
Aggregate  (cost=4739.58..4739.59 rows=1 width=0) (actual  
time=4686.472..4686.473 rows=1 loops=1)
    ->  Unique  (cost=4380.56..4483.14 rows=20515 width=8) (actual  
time=4609.046..4669.289 rows=19237 loops=1)
          ->  Sort  (cost=4380.56..4431.85 rows=20515 width=8)  
(actual time=4609.041..4627.976 rows=19255 loops=1)
                Sort Key: test.val
                ->  Bitmap Heap Scan on test  (cost=88.80..2911.24  
rows=20515 width=8) (actual time=130.954..4559.244 rows=19255 loops=1)
                      Recheck Cond: (val < 10000000)
                      ->  Bitmap Index Scan on testval   
(cost=0.00..88.80 rows=20515 width=0) (actual time=120.041..120.041  
rows=19255 loops=1)
                            Index Cond: (val < 10000000)
Total runtime: 4690.513 ms
(9 rows)

postgres=# explain select count(*) from (select distinct on (val) *  
from test where val<100000000) as foo;
                                        QUERY PLAN
------------------------------------------------------------------------ 
-----------------
Aggregate  (cost=16350.20..16350.21 rows=1 width=0)
    ->  Unique  (cost=0.00..13748.23 rows=208158 width=8)
          ->  Index Scan using testval on test  (cost=0.00..13227.83  
rows=208158 width=8)
                Index Cond: (val < 100000000)
(4 rows)

postgres=# set enable_indexscan=off;
postgres=# explain analyze select count(*) from (select distinct on  
(val) * from test where val<100000000) as foo;
                                                                       
QUERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
----
Aggregate  (cost=28081.27..28081.28 rows=1 width=0) (actual  
time=6444.650..6444.651 rows=1 loops=1)
    ->  Unique  (cost=24438.50..25479.29 rows=208158 width=8) (actual  
time=5669.118..6277.206 rows=194142 loops=1)
          ->  Sort  (cost=24438.50..24958.89 rows=208158 width=8)  
(actual time=5669.112..5852.351 rows=194342 loops=1)
                Sort Key: test.val
                ->  Bitmap Heap Scan on test  (cost=882.55..6050.53  
rows=208158 width=8) (actual time=1341.114..4989.840 rows=194342  
loops=1)
                      Recheck Cond: (val < 100000000)
                      ->  Bitmap Index Scan on testval   
(cost=0.00..882.55 rows=208158 width=0) (actual  
time=1339.707..1339.707 rows=194342 loops=1)
                            Index Cond: (val < 100000000)
Total runtime: 6487.114 ms
(9 rows)

postgres=# explain analyze select count(*) from (select distinct on  
(val) * from test where val<750000000) as foo;
                                                                       Q 
UERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
-------
Aggregate  (cost=204576.53..204576.54 rows=1 width=0) (actual  
time=35718.935..35718.936 rows=1 loops=1)
    ->  Unique  (cost=178717.28..186105.64 rows=1477671 width=8)  
(actual time=29465.856..34459.640 rows=1462348 loops=1)
          ->  Sort  (cost=178717.28..182411.46 rows=1477671 width=8)  
(actual time=29465.853..31658.056 rows=1463793 loops=1)
                Sort Key: test.val
                ->  Bitmap Heap Scan on test  (cost=6256.85..27293.73  
rows=1477671 width=8) (actual time=8316.676..11561.018 rows=1463793  
loops=1)
                      Recheck Cond: (val < 750000000)
                      ->  Bitmap Index Scan on testval   
(cost=0.00..6256.85 rows=1477671 width=0) (actual  
time=8305.963..8305.963 rows=1463793 loops=1)
                            Index Cond: (val < 750000000)
Total runtime: 35736.167 ms
(9 rows)

postgres=# explain analyze select count(*) from (select distinct on  
(val) * from test where val<800000000) as foo;
                                                              QUERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------
Aggregate  (cost=217582.20..217582.21 rows=1 width=0) (actual  
time=28718.331..28718.332 rows=1 loops=1)
    ->  Unique  (cost=190140.72..197981.14 rows=1568084 width=8)  
(actual time=22175.170..27380.343 rows=1559648 loops=1)
          ->  Sort  (cost=190140.72..194060.93 rows=1568084 width=8)  
(actual time=22175.165..24451.892 rows=1561181 loops=1)
                Sort Key: test.val
                ->  Seq Scan on test  (cost=0.00..28780.40  
rows=1568084 width=8) (actual time=13.130..3358.923 rows=1561181  
loops=1)
                      Filter: (val < 800000000)
Total runtime: 28735.264 ms
(7 rows)

I did not post any result for the indexscan plan, because it takes to  
much time.
Why the stupid indexscan plan on the whole table ?

Cordialement,
Jean-GĂ©rard Pailloncy


Responses

pgsql-performance by date

Next:From: Luke LonerganDate: 2005-11-24 01:50:57
Subject: Re: Hardware/OS recommendations for large databases (
Previous:From: Alan StangeDate: 2005-11-23 22:00:37
Subject: Re: Hardware/OS recommendations for large databases (

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group