Re: Slow count(*)

From: Brian Hurt <bhurt(at)janestcapital(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Danny Abraham <danny_abraham(at)bmc(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Slow count(*)
Date: 2008-01-03 15:37:31
Message-ID: 477D013B.2010909@janestcapital.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Kevin Grittner wrote:

>
>If you really are doing proper maintenance, and you don't need exact
>counts, you might be able to use the approximation stored in the
>system tables:
>
>

Also, if you're using count(*) as an existance test (common in Mysql
code), it's better to use exists instead. Using a table in my system, I
see:

> proddb=> explain analyze select count(*) from instrument_listings
> where update_date is null and delist_date is null;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=18385.45..18385.46 rows=1 width=0) (actual
> time=897.799..897.801 rows=1 loops=1)
> -> Seq Scan on instrument_listings (cost=0.00..17973.43
> rows=164807 width=0) (actual time=0.018..634.197 rows=146122 loops=1)
> Filter: ((update_date IS NULL) AND (delist_date IS NULL))
> Total runtime: 897.846 ms
> (4 rows)
>
> Time: 898.478 ms
> proddb=> explain analyze select true where exists(select 1 from
> instrument_listings where update_date is null and delist_date is null
> limit 1);
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
> Result (cost=0.11..0.12 rows=1 width=0) (actual time=0.026..0.028
> rows=1 loops=1)
> One-Time Filter: $0
> InitPlan
> -> Limit (cost=0.00..0.11 rows=1 width=0) (actual
> time=0.022..0.022 rows=1 loops=1)
> -> Seq Scan on instrument_listings (cost=0.00..17973.43
> rows=164807 width=0) (actual time=0.017..0.017 rows=1 loops=1)
> Filter: ((update_date IS NULL) AND (delist_date IS NULL))
> Total runtime: 0.063 ms
> (7 rows)
>
> Time: 0.768 ms
> proddb=>

The exists version is over 1000x faster (and the only reason it's not
more studly is that I'm working on the table as we speak, so it's all in
memory).

As a general rule in postgres, don't do count(*) unless you really mean it.

Brian

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Albe Laurenz 2008-01-03 15:45:55 Re: Can't make backup
Previous Message Glyn Astill 2008-01-03 15:32:59 Re: Table auditing / Pg/tcl help

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2008-01-03 15:40:13 Re: Selectivity estimation for equality and range queries
Previous Message Usama Dar 2008-01-03 15:27:23 Re: Autovacuum & Table List Ordering