Re: count(*) slow on large tables

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Christopher Browne <cbbrowne(at)libertyrms(dot)info>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: count(*) slow on large tables
Date: 2003-10-04 15:56:40
Message-ID: 200310041556.h94Fuek24423@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Christopher Browne wrote:
> jllachan(at)nsd(dot)ca (Jean-Luc Lachance) writes:
> > That's one of the draw back of MVCC.
> > I once suggested that the transaction number and other house keeping
> > info be included in the index, but was told to forget it...
> > It would solve once and for all the issue of seq_scan vs index_scan.
> > It would simplify the aggregate problem.
>
> It would only simplify _one_ case, namely the case where someone cares
> about the cardinality of a relation, and it would do that at
> _considerable_ cost.
>
> A while back I outlined how this would have to be done, and for it to
> be done efficiently, it would be anything BUT simple.
>
> It would be very hairy to implement it correctly, and all this would
> cover is the single case of "SELECT COUNT(*) FROM SOME_TABLE;"

We do have a TODO item:

* Consider using MVCC to cache count(*) queries with no WHERE clause

The idea is to cache a recent count of the table, then have
insert/delete add +/- records to the count. A COUNT(*) would get the
main cached record plus any visible +/- records. This would allow the
count to return the proper value depending on the visibility of the
requesting transaction, and it would require _no_ heap or index scan.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-10-04 16:07:53 COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)
Previous Message Tom Lane 2003-10-04 15:45:28 Re: Using backslash in query

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-10-04 16:07:53 COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)
Previous Message Andrew Sullivan 2003-10-04 15:23:38 Re: reindex/vacuum locking/performance?