Re: COUNT and Performance ...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: hs(at)cybertec(dot)at
Cc: pgsql-hackers(at)postgresql(dot)org, neilc(at)samurai(dot)com
Subject: Re: COUNT and Performance ...
Date: 2003-02-02 18:04:14
Message-ID: 12357.1044209054@sss.pgh.pa.us
Lists: pgsql-hackers

Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> writes:
> In special cases there can be another way to avoid seq scans:
> [ use pgstattuple() ]

But pgstattuple does do a sequential scan of the table. You avoid a lot
of the executor's tuple-pushing and plan-node-traversing machinery that
way, but the I/O requirement is going to be exactly the same.

> If people want to count ALL rows of a table, the contrib stuff is pretty
> useful. It seems to be transaction safe.

Not entirely. pgstattuple uses HeapTupleSatisfiesNow(), which means you
get a count of tuples that are committed good in terms of the effects of
transactions committed up to the instant each tuple is examined. This
is in general different from what count(*) would tell you, because it
ignores snapshotting. It'd be quite unrepeatable too, in the face of
active concurrent changes --- it's very possible for pgstattuple to
count a single row twice or not at all, if it's being concurrently
updated and the other transaction commits between the times pgstattuple
sees the old and new versions of the row.
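
To make the double-count and zero-count cases concrete, here is a toy Python
model. It is not PostgreSQL code. It assumes a heavily simplified visibility
rule: a tuple version is visible if its inserting transaction (xmin) has
committed and its deleting transaction (xmax), if any, has not. The names
TupleVersion, visible, and scan_count are hypothetical, and the real
HeapTupleSatisfiesNow() handles many more cases. The "now" scan consults the
live set of committed transactions at the moment each tuple is examined; the
snapshot scan consults only the set that was committed when the scan started.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class TupleVersion:
    row_id: int
    xmin: int             # transaction that created this version
    xmax: Optional[int]   # transaction that deleted/updated it, if any


def visible(t: TupleVersion, committed) -> bool:
    """Simplified rule: inserter committed, deleter (if any) not committed."""
    if t.xmin not in committed:
        return False
    return t.xmax is None or t.xmax not in committed


def scan_count(heap: List[TupleVersion], committed_at_start: Set[int],
               commit_events: Dict[int, int], use_snapshot: bool) -> int:
    """Count visible tuples in physical order.

    commit_events maps a heap position to a transaction id that commits
    just before the scan examines that position.
    """
    live = set(committed_at_start)            # changes while we scan
    snapshot = frozenset(committed_at_start)  # frozen at scan start
    count = 0
    for pos, t in enumerate(heap):
        if pos in commit_events:
            live.add(commit_events[pos])
        if visible(t, snapshot if use_snapshot else live):
            count += 1
    return count


# One logical row. Transaction 1 inserted it; transaction 2 is updating it
# and commits while the scan is between the two versions.
old = TupleVersion(row_id=1, xmin=1, xmax=2)
new = TupleVersion(row_id=1, xmin=2, xmax=None)
commit_mid_scan = {1: 2}  # xid 2 commits just before position 1 is read

# Old version stored first: the "now" scan counts the row twice.
print(scan_count([old, new], {1}, commit_mid_scan, use_snapshot=False))  # 2
# New version stored first: the "now" scan misses the row entirely.
print(scan_count([new, old], {1}, commit_mid_scan, use_snapshot=False))  # 0
# A snapshot scan sees exactly one version in either layout.
print(scan_count([old, new], {1}, commit_mid_scan, use_snapshot=True))   # 1
print(scan_count([new, old], {1}, commit_mid_scan, use_snapshot=True))   # 1
```

Which of the two wrong answers appears depends only on where the new version
happens to land physically relative to the old one.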

> The performance boost is great (PostgreSQL 7.3, RedHat, 166MHz):

I think your test case is small enough that the whole table is resident
in memory, so this measurement only accounts for CPU time per tuple and
not any I/O. Given the small size of pgstattuple's per-tuple loop, the
speed differential is not too surprising --- but it won't scale up to
larger tables.
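
A back-of-the-envelope cost model shows why the differential should shrink
once the table no longer fits in memory. Every constant below is a made-up
illustrative value, not a measurement, and the function names are
hypothetical. The model simply assumes scan time is per-tuple CPU cost plus,
when the table is not cached, a fixed I/O cost per page.

```python
def scan_seconds(pages, cpu_per_tuple, cached,
                 tuples_per_page=100, io_per_page=1e-4):
    """Estimated time for one sequential scan under a linear cost model."""
    cpu = pages * tuples_per_page * cpu_per_tuple
    io = 0.0 if cached else pages * io_per_page
    return cpu + io


# Assumed per-tuple CPU costs in seconds (illustrative only): the executor
# path for count(*) is taken to be 5x more expensive per tuple than
# pgstattuple's small loop.
COUNT_CPU = 1.0e-6
PGSTAT_CPU = 0.2e-6


def speedup(pages, cached):
    """How many times faster the cheaper per-tuple loop finishes."""
    return (scan_seconds(pages, COUNT_CPU, cached)
            / scan_seconds(pages, PGSTAT_CPU, cached))


# Small table, fully cached: the whole CPU advantage shows up.
print(round(speedup(1_000, cached=True), 2))
# Large table read from disk: identical I/O dilutes the advantage.
print(round(speedup(1_000_000, cached=False), 2))
```

With these assumed numbers the cached case shows about a 5x gain, while the
uncached case drops below 2x because both scans pay the same I/O cost.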

Sometime it would be interesting to profile count(*) on large tables
and see exactly where the CPU time goes. It might be possible to shave
off some of the executor overhead ...

regards, tom lane
