count(*) performance improvement ideas

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: count(*) performance improvement ideas
Date: 2008-03-12 15:04:53
Message-ID: 2e78013d0803120804i4b945f02l7e944d0a079c32c2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I am reading discussion about improving count(*) performance. I have
also seen a TODO for this.

Many people have suggested TRIGGER based solution to the slow count(*)
problem. I looked at the following link which presents the solution
neatly.

http://www.varlena.com/GeneralBits/120.php

But how does that really work for SERIALIZABLE transactions ? If
two concurrent transactions INSERT/DELETE rows from a table,
the trigger execution of one of the transactions is bound to fail
because of concurrent access. Even for READ COMMITTED transactions,
the trigger execution would wait if the other transaction has executed
the trigger on the same table. Well, I think the READ COMMITTED case
can be handled with DEFERRED triggers, but that may require queuing up
too many triggers if there are many inserts/deletes in a transaction.

Running trigger for every insert/delete seems too expensive. I wonder
if we can have a separate "counter" table (as suggested in the TRIGGER
based solution) and track total number of tuples inserted and deleted
in a transaction (and all the committed subtransactions). We then
execute a single UPDATE at the end of the transaction. With HOT,
updating the "counter" table should not be a big pain since all these
updates can potentially be HOT updates. Also since the update of
the "counter" table happens at the commit time, other transactions
inserting/deleting from the same user table may need to wait for a
very small period on the "counter" table tuple.

This still doesn't solve the serializable transaction problem
though. But I am sure we can figure out some solution for that case
as well if we agree on the general approach.

I am sure this must have been discussed before. So what are the
objections ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2008-03-12 15:08:55 Re: TODO-list on wiki (was: TODO update about SQLSTATE to PGconn)
Previous Message Bruce Momjian 2008-03-12 15:03:13 Re: TODO-list on wiki (was: TODO update about SQLSTATE to PGconn)