
Re: Select max(foo) and select count(*) optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "D'Arcy J(dot)M(dot) Cain" <darcy(at)druid(dot)net>
Cc: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Select max(foo) and select count(*) optimization
Date: 2004-01-06 23:19:55
Message-ID: 2698.1073431195@sss.pgh.pa.us
Lists: pgsql-performance
"D'Arcy J.M. Cain" <darcy(at)druid(dot)net> writes:
> In any case, if I have to vacuum a 20,000,000 row table to get an accurate 
> count then I may as well run count(*) on it.
> (*): Actually I only analyze but I understand that that should be sufficient.

ANALYZE without VACUUM will deliver a not-very-accurate estimate, since it
only looks at a sample of the table's pages and doesn't grovel through
every one.  Any of the VACUUM variants, on the other hand, will set
pg_class.reltuples reasonably accurately (as the number of rows actually
seen and left undeleted by the VACUUM pass).
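
[Editorial note: the following example is not part of the original message. It is a minimal sketch of how the stored estimate Tom describes can be compared against an exact count; the table name foo is a placeholder.]

    -- row-count estimate as last set by a VACUUM (or, less accurately, ANALYZE)
    SELECT reltuples FROM pg_class WHERE relname = 'foo';

    -- exact count, but requires scanning the whole table
    SELECT count(*) FROM foo;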

There are pathological cases where ANALYZE's estimate of the overall row
count can be horribly bad --- mainly, when the early pages of the table
are empty or nearly so, but there are well-filled pages out near the
end.  I have a TODO item to try to make ANALYZE less prone to getting
fooled that way...

			regards, tom lane

