Re: Performance Issues with count()

From: Mark kirkwood <markir(at)slingshot(dot)co(dot)nz>
To: S Grannis <sjg(at)email(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance Issues with count()
Date: 2002-04-26 08:39:36
Message-ID: 1019810377.1267.89.camel@spikey.slithery.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I wonder if using a summary table could help you here...

for example :

suppose you have a table - big_table(id,...,interesting_value)

and you want to provide counts of interesting_value quickly.

The plan is to create another table -
summ_table(interesting_value,...,its_count)

and keep it up to date with big_table via triggers.

Then any query of the form :

SELECT interesting_value, count(*)
FROM big_table
WHERE ...
GROUP BY interesting value;

can be answered by :

SELECT interesting_value,its_count
FROM summ_table
WHERE ...;

which is generally *much* faster.
(clearly a complete count is easy to answer quickly too...)

I must say that I have not used this technique in Postgresql (I have
used it in DB2 and Oracle). However this issue is typical for large
databases of all flavours (i.e its too slow to scan and count a lot of
values for each query.... so you do it once and save the results for
future reference).

In addition the rule system in Postgresql *might* be ameniable to
providing a sort of 'query rewrite' to automatically make (some) queries
on big_table go to summ_table instead...(have not tried this... but you
never know until you try)

best wishes

Mark

P.s : the trigger code to keep big_table and summ_table in sync is the
hard bit... but there are no doubt many folks on this list ...incl even
me.. who will happily help you out here.

Browse pgsql-general by date

  From Date Subject
Next Message Joseph Artsimovich 2002-04-26 09:50:15 bit datatype in 7.2.1
Previous Message Uros Gruber 2002-04-26 06:53:07 Re: Auth problem in pg_hdba.conf