Speeding up aggregates

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Speeding up aggregates
Date: 2002-12-06 17:30:46
Message-ID: web-2024223@davinci.ethosmedia.com
Lists: pgsql-performance

Folks,

One of Postgres' poorest-performing areas is aggregates. This is an
unfortunate side effect of our fully extensible aggregate and type
system. However, I thought that the folks on this list might have a
few tips on making aggregates perform faster.

Here's mine: Aggregate Caching Table

This is a brute-force approach. However, if you have a table with a
million records for which users *frequently* ask for grand totals or
counts, it can work fine.

A simple example:

CREATE TABLE client_case_counts (
	-- one row per client; the PRIMARY KEY keeps the cache from
	-- accumulating duplicate rows for a client
	client_id INT NOT NULL PRIMARY KEY
		REFERENCES clients(client_id) ON DELETE CASCADE,
	no_cases INT NOT NULL DEFAULT 0
);
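
If cases already contains rows when you create the cache table, you
would want to seed it before attaching the triggers; a minimal
one-time backfill along these lines:

INSERT INTO client_case_counts ( client_id, no_cases )
	SELECT client_id, count(*)
	FROM cases
	GROUP BY client_id;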

Then create triggers:

CREATE FUNCTION tf_maintain_client_counts ()
RETURNS trigger AS '
BEGIN
	-- bump the cached count if a row for this client already exists
	UPDATE client_case_counts SET no_cases = no_cases + 1
		WHERE client_id = NEW.client_id;
	-- otherwise start a row at 1; INSERT ... VALUES takes no WHERE
	-- clause, so use INSERT ... SELECT to make it conditional
	INSERT INTO client_case_counts ( client_id, no_cases )
		SELECT NEW.client_id, 1
		WHERE NOT EXISTS (SELECT 1 FROM client_case_counts ccc2
			WHERE ccc2.client_id = NEW.client_id);
	RETURN NEW;
END;' LANGUAGE 'plpgsql';

CREATE TRIGGER tg_maintain_client_counts
	AFTER INSERT ON cases
	FOR EACH ROW EXECUTE PROCEDURE tf_maintain_client_counts();
etc.
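
The "etc." would include at least a matching DELETE trigger so the
cache doesn't drift out of sync; a sketch along the same lines (the
function and trigger names here are my own):

CREATE FUNCTION tf_decrement_client_counts ()
RETURNS trigger AS '
BEGIN
	-- companion to the insert trigger above: back the count out
	-- when a case row is removed
	UPDATE client_case_counts SET no_cases = no_cases - 1
		WHERE client_id = OLD.client_id;
	RETURN OLD;
END;' LANGUAGE 'plpgsql';

CREATE TRIGGER tg_decrement_client_counts
	AFTER DELETE ON cases
	FOR EACH ROW EXECUTE PROCEDURE tf_decrement_client_counts();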

While effective, this approach is costly in terms of update/insert
processing. It is also limited to whatever aggregate requests you have
anticipated ... it does no good for aggregates over a user-defined
range.
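
For the queries you did anticipate, though, the payoff is that a grand
total becomes a single-row fetch instead of a scan of the big table;
for example (client 42 is just an illustration):

-- cheap: read the cached count
SELECT no_cases FROM client_case_counts WHERE client_id = 42;

-- instead of counting over a million-row table each time:
-- SELECT count(*) FROM cases WHERE client_id = 42;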

What have other Postgres users done to speed up aggregates on large
tables?

-Josh Berkus
