8.1 count(*) distinct: IndexScan/SeqScan

From: Pailloncy Jean-Gerard <jg(at)rilk(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: 8.1 count(*) distinct: IndexScan/SeqScan
Date: 2005-11-23 22:14:47
Message-ID: 334D9941-B5B3-4D18-8312-F85D0FB054ED@rilk.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

PostgreSQL 8.1 fresh install on a freshly installed OpenBSD 3.8 box.

postgres=# CREATE DATABASE test;
CREATE DATABASE
postgres=# create table test (id serial, val integer);
NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for
serial column "test.id"
CREATE TABLE
postgres=# create unique index testid on test (id);
CREATE INDEX
postgres=# create index testval on test (val);
CREATE INDEX
postgres=# insert into test (val) values (round(random()
*1024*1024*1024));
INSERT 0 1

[...] insert many random values

postgres=# vaccum full verbose analyze;
postgres=# select count(1) from test;
count
---------
2097152
(1 row)

postgres=# explain select count(*) from (select distinct on (val) *
from test) as foo;
QUERY PLAN
------------------------------------------------------------------------
------------------
Aggregate (cost=66328.72..66328.73 rows=1 width=0)
-> Unique (cost=0.00..40114.32 rows=2097152 width=8)
-> Index Scan using testval on test (cost=0.00..34871.44
rows=2097152 width=8)
(3 rows)

postgres=# set enable_indexscan=off;
postgres=# explain analyze select count(*) from (select distinct on
(val) * from test) as foo;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------
Aggregate (cost=280438.64..280438.65 rows=1 width=0) (actual
time=39604.107..39604.108 rows=1 loops=1)
-> Unique (cost=243738.48..254224.24 rows=2097152 width=8)
(actual time=30281.004..37746.488 rows=2095104 loops=1)
-> Sort (cost=243738.48..248981.36 rows=2097152 width=8)
(actual time=30280.999..33744.197 rows=2097152 loops=1)
Sort Key: test.val
-> Seq Scan on test (cost=0.00..23537.52
rows=2097152 width=8) (actual time=11.550..3262.433 rows=2097152
loops=1)
Total runtime: 39624.094 ms
(6 rows)

postgres=# set enable_indexscan=on;
postgres=# explain analyze select count(*) from (select distinct on
(val) * from test where val<10000000) as foo;

QUERY PLAN
------------------------------------------------------------------------
-----------------------------------------------------------------------
Aggregate (cost=4739.58..4739.59 rows=1 width=0) (actual
time=4686.472..4686.473 rows=1 loops=1)
-> Unique (cost=4380.56..4483.14 rows=20515 width=8) (actual
time=4609.046..4669.289 rows=19237 loops=1)
-> Sort (cost=4380.56..4431.85 rows=20515 width=8)
(actual time=4609.041..4627.976 rows=19255 loops=1)
Sort Key: test.val
-> Bitmap Heap Scan on test (cost=88.80..2911.24
rows=20515 width=8) (actual time=130.954..4559.244 rows=19255 loops=1)
Recheck Cond: (val < 10000000)
-> Bitmap Index Scan on testval
(cost=0.00..88.80 rows=20515 width=0) (actual time=120.041..120.041
rows=19255 loops=1)
Index Cond: (val < 10000000)
Total runtime: 4690.513 ms
(9 rows)

postgres=# explain select count(*) from (select distinct on (val) *
from test where val<100000000) as foo;
QUERY PLAN
------------------------------------------------------------------------
-----------------
Aggregate (cost=16350.20..16350.21 rows=1 width=0)
-> Unique (cost=0.00..13748.23 rows=208158 width=8)
-> Index Scan using testval on test (cost=0.00..13227.83
rows=208158 width=8)
Index Cond: (val < 100000000)
(4 rows)

postgres=# set enable_indexscan=off;
postgres=# explain analyze select count(*) from (select distinct on
(val) * from test where val<100000000) as foo;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----
Aggregate (cost=28081.27..28081.28 rows=1 width=0) (actual
time=6444.650..6444.651 rows=1 loops=1)
-> Unique (cost=24438.50..25479.29 rows=208158 width=8) (actual
time=5669.118..6277.206 rows=194142 loops=1)
-> Sort (cost=24438.50..24958.89 rows=208158 width=8)
(actual time=5669.112..5852.351 rows=194342 loops=1)
Sort Key: test.val
-> Bitmap Heap Scan on test (cost=882.55..6050.53
rows=208158 width=8) (actual time=1341.114..4989.840 rows=194342
loops=1)
Recheck Cond: (val < 100000000)
-> Bitmap Index Scan on testval
(cost=0.00..882.55 rows=208158 width=0) (actual
time=1339.707..1339.707 rows=194342 loops=1)
Index Cond: (val < 100000000)
Total runtime: 6487.114 ms
(9 rows)

postgres=# explain analyze select count(*) from (select distinct on
(val) * from test where val<750000000) as foo;
Q
UERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-------
Aggregate (cost=204576.53..204576.54 rows=1 width=0) (actual
time=35718.935..35718.936 rows=1 loops=1)
-> Unique (cost=178717.28..186105.64 rows=1477671 width=8)
(actual time=29465.856..34459.640 rows=1462348 loops=1)
-> Sort (cost=178717.28..182411.46 rows=1477671 width=8)
(actual time=29465.853..31658.056 rows=1463793 loops=1)
Sort Key: test.val
-> Bitmap Heap Scan on test (cost=6256.85..27293.73
rows=1477671 width=8) (actual time=8316.676..11561.018 rows=1463793
loops=1)
Recheck Cond: (val < 750000000)
-> Bitmap Index Scan on testval
(cost=0.00..6256.85 rows=1477671 width=0) (actual
time=8305.963..8305.963 rows=1463793 loops=1)
Index Cond: (val < 750000000)
Total runtime: 35736.167 ms
(9 rows)

postgres=# explain analyze select count(*) from (select distinct on
(val) * from test where val<800000000) as foo;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------
Aggregate (cost=217582.20..217582.21 rows=1 width=0) (actual
time=28718.331..28718.332 rows=1 loops=1)
-> Unique (cost=190140.72..197981.14 rows=1568084 width=8)
(actual time=22175.170..27380.343 rows=1559648 loops=1)
-> Sort (cost=190140.72..194060.93 rows=1568084 width=8)
(actual time=22175.165..24451.892 rows=1561181 loops=1)
Sort Key: test.val
-> Seq Scan on test (cost=0.00..28780.40
rows=1568084 width=8) (actual time=13.130..3358.923 rows=1561181
loops=1)
Filter: (val < 800000000)
Total runtime: 28735.264 ms
(7 rows)

I did not post any result for the indexscan plan, because it takes to
much time.
Why the stupid indexscan plan on the whole table ?

Cordialement,
Jean-Gérard Pailloncy

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Luke Lonergan 2005-11-24 01:50:57 Re: Hardware/OS recommendations for large databases (
Previous Message Alan Stange 2005-11-23 22:00:37 Re: Hardware/OS recommendations for large databases (