Massive table bloat

From: "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Massive table bloat
Date: 2010-03-03 15:33:58
Message-ID: 28011CD60FB1724DBA4442E38277F6260F91793D@hermes.computec.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi!

I've set up some system to track slow page executions in one of our (as
yet not live) web apps. The tracking itself is handled completely within
the database using a function. Within a very short time (approx. 1 week)
and although we haven't got that much traffic on our testpages, the
table in question as grown beyond a size of 23 GB, even though a SELECT
count(*) on it will tell me that it only contains 235 rows. I'm sure I
must be missing something obvious here...

Here's the DDL for the table:

CREATE TABLE stats.slowpages
(
url text NOT NULL,
lastexecduration integer NOT NULL,
avgslowexecduration integer,
execcount integer,
lastexectime timestamp without time zone,
site_id integer NOT NULL,
slowestexecduration integer,
totaltimespent bigint,
CONSTRAINT "slowpages_pkey" PRIMARY KEY (url)
)WITHOUT OIDS;

-- Indexes
CREATE INDEX idx_slowpages_duration ON stats.slowpages USING btree
(lastexecduration);
CREATE INDEX idx_slowpages_avgduration ON stats.slowpages USING btree
(avgslowexecduration);
CREATE INDEX idx_slowpages_execcount ON stats.slowpages USING btree
(execcount);
CREATE INDEX idx_slowpages_lastexec ON stats.slowpages USING btree
(lastexectime);
CREATE INDEX idx_slowpages_site ON stats.slowpages USING btree
(site_id);
CREATE UNIQUE INDEX uidx_slowpages_url_site ON stats.slowpages USING
btree (url, site_id);
CREATE INDEX idx_slowpages_totaltimespent ON stats.slowpages USING btree
(totaltimespent);

And this here is the function we use to insert or update entries in this
table:

CREATE or REPLACE FUNCTION "stats"."iou_slowpages"(
IN "_site_id" integer,
IN "_url" text,
IN "_duration" integer)
RETURNS void AS
$BODY$
BEGIN
LOOP

UPDATE stats.slowpages
SET avgslowexecduration =
((avgslowexecduration*execcount)+_duration)/(execcount+1)
, execcount = execcount+1
, lastexectime = now()
, lastexecduration = _duration
, totaltimespent = totaltimespent + _duration
, slowestexecduration = CASE WHEN _duration >
slowestexecduration
THEN _duration ELSE slowestexecduration END
WHERE url = _url AND site_id = _site_id;
IF found THEN
RETURN;
END IF;

BEGIN
INSERT INTO
stats.slowpages(url,lastexecduration,avgslowexecduration,slowestexecdura
tion,totaltimespent,execcount,lastexectime,site_id)
VALUES (_url, _duration, _duration,_duration,_duration, 1,
now(), _site_id);
RETURN;
EXCEPTION WHEN unique_violation THEN

END;
END LOOP;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

_site_id is a small integer value, _url is a full URL string to a page
and _duration is a value in milliseconds. We're on PostgreSQL 8.3.7.

Any idea about what I may be missing here?

Kind regards

Markus

Computec Media AG
Sitz der Gesellschaft und Registergericht: Furth (HRB 8818)
Vorstandsmitglieder: Albrecht Hengstenberg (Vorsitzender) und Rainer Rosenbusch
Vorsitzender des Aufsichtsrates: Jurg Marquard
Umsatzsteuer-Identifikationsnummer: DE 812 575 276

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ivan Sergio Borgonovo 2010-03-03 15:41:26 Re: bug in function arguments "recognition"
Previous Message Ivan Sergio Borgonovo 2010-03-03 15:31:30 Re: bug in function arguments "recognition"