Skip site navigation (1) Skip section navigation (2)

Re: Slow count(*)

From: Brian Hurt <bhurt(at)janestcapital(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Danny Abraham <danny_abraham(at)bmc(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Slow count(*)
Date: 2008-01-03 15:37:31
Message-ID: 477D013B.2010909@janestcapital.com (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-hackers
Kevin Grittner wrote:

> 
>If you really are doing proper maintenance, and you don't need exact
>counts, you might be able to use the approximation stored in the
>system tables:
>  
>

Also, if you're using count(*) as an existance test (common in Mysql 
code), it's better to use exists instead.  Using a table in my system, I 
see:

> proddb=> explain analyze select count(*) from instrument_listings 
> where update_date is null and delist_date is null;
>                                                             QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=18385.45..18385.46 rows=1 width=0) (actual 
> time=897.799..897.801 rows=1 loops=1)
>    ->  Seq Scan on instrument_listings  (cost=0.00..17973.43 
> rows=164807 width=0) (actual time=0.018..634.197 rows=146122 loops=1)
>          Filter: ((update_date IS NULL) AND (delist_date IS NULL))
>  Total runtime: 897.846 ms
> (4 rows)
>
> Time: 898.478 ms
> proddb=> explain analyze select true where exists(select 1 from 
> instrument_listings where update_date is null and delist_date is null 
> limit 1);
>                                                              QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
>  Result  (cost=0.11..0.12 rows=1 width=0) (actual time=0.026..0.028 
> rows=1 loops=1)
>    One-Time Filter: $0
>    InitPlan
>      ->  Limit  (cost=0.00..0.11 rows=1 width=0) (actual 
> time=0.022..0.022 rows=1 loops=1)
>            ->  Seq Scan on instrument_listings  (cost=0.00..17973.43 
> rows=164807 width=0) (actual time=0.017..0.017 rows=1 loops=1)
>                  Filter: ((update_date IS NULL) AND (delist_date IS NULL))
>  Total runtime: 0.063 ms
> (7 rows)
>
> Time: 0.768 ms
> proddb=>


The exists version is over 1000x faster (and the only reason it's not 
more studly is that I'm working on the table as we speak, so it's all in 
memory).

As a general rule in postgres, don't do count(*) unless you really mean it.

Brian


In response to

pgsql-hackers by date

Next:From: Peter EisentrautDate: 2008-01-03 15:40:13
Subject: Re: Selectivity estimation for equality and range queries
Previous:From: Usama DarDate: 2008-01-03 15:27:23
Subject: Re: Autovacuum & Table List Ordering

pgsql-general by date

Next:From: Albe LaurenzDate: 2008-01-03 15:45:55
Subject: Re: Can't make backup
Previous:From: Glyn AstillDate: 2008-01-03 15:32:59
Subject: Re: Table auditing / Pg/tcl help

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group