
Re: Select max(foo) and select count(*) optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "D'Arcy J(dot)M(dot) Cain" <darcy(at)druid(dot)net>
Cc: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Select max(foo) and select count(*) optimization
Date: 2004-01-06 23:19:55
Message-ID: 2698.1073431195@sss.pgh.pa.us
Lists: pgsql-performance
"D'Arcy J.M. Cain" <darcy(at)druid(dot)net> writes:
> In any case, if I have to vacuum a 20,000,000 row table to get an accurate 
> count then I may as well run count(*) on it.
> (*): Actually I only analyze but I understand that that should be sufficient.

ANALYZE without VACUUM will deliver a not-very-accurate estimate, since it
only looks at a sample of the table's pages and doesn't grovel through
every one.  Any of the VACUUM variants, on the other hand, will set
pg_class.reltuples reasonably accurately (as the number of rows actually
seen and left undeleted by the VACUUM pass).
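
[Editorial note: the following example is not part of the original message. It is a minimal sketch of how the stored estimate Tom describes can be compared against an exact count; the table name foo is a placeholder.]

    -- row-count estimate as last set by a VACUUM (or, less accurately, ANALYZE)
    SELECT reltuples FROM pg_class WHERE relname = 'foo';

    -- exact count, but requires scanning the whole table
    SELECT count(*) FROM foo;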

There are pathological cases where ANALYZE's estimate of the overall row
count can be horribly bad --- mainly, when the early pages of the table
are empty or nearly so, but there are well-filled pages out near the
end.  I have a TODO item to try to make ANALYZE less prone to getting
fooled that way...

			regards, tom lane

