Identifying bloated tables

From: Michal Taborsky - Internet Mall <michal(dot)taborsky(at)mall(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: Identifying bloated tables
Date: 2006-08-28 14:39:30
Message-ID: 44F30022.1000009@mall.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I just put together a view, which helps us in indentifying which
database tables are suffering from space bloat, ie. they take up much
more space than they actually should. I though this might be useful for
some folk here, because the questions about bloat-related performance
degradation are quite common.

When using this view, you are interested in tables, which have the
"bloat" column higher that say 2.0 (in freshly dump/restored/analyzed
database they should all be around 1.0).

The bloat problem can be one-time fixed either by VACUUM FULL or
CLUSTER, but if the problem is coming back after while, you should
consider doing VACUUM more often or increasing you FSM settings in
postgresql.conf.

I hope I did the view right, it is more or less accurate, for our
purposes (for tables of just few pages the numbers may be off, but then
again, you are usually not much concerned about these tiny 5-page tables
performance-wise).

Hope this helps someone.

Here comes the view.

CREATE OR REPLACE VIEW "public"."relbloat" (
nspname,
relname,
reltuples,
relpages,
avgwidth,
expectedpages,
bloat,
wastedspace)
AS
SELECT pg_namespace.nspname, pg_class.relname, pg_class.reltuples,
pg_class.relpages, rowwidths.avgwidth, ceil(((pg_class.reltuples *
(rowwidths.avgwidth)::double precision) /
(current_setting('block_size'::text))::double precision)) AS
expectedpages,
((pg_class.relpages)::double precision / ceil(((pg_class.reltuples *
(rowwidths.avgwidth)::double precision) /
(current_setting('block_size'::text))::double precision))) AS bloat,
ceil(((((pg_class.relpages)::double precision *
(current_setting('block_size'::text))::double precision) -
ceil((pg_class.reltuples * (rowwidths.avgwidth)::double precision))) /
(1024)::double precision)) AS wastedspace
FROM (((
SELECT pg_statistic.starelid, sum(pg_statistic.stawidth) AS avgwidth
FROM pg_statistic
GROUP BY pg_statistic.starelid
) rowwidths JOIN pg_class ON ((rowwidths.starelid = pg_class.oid)))
JOIN
pg_namespace ON ((pg_namespace.oid = pg_class.relnamespace)))
WHERE (pg_class.relpages > 1);

Bye.

--
Michal Táborský
IT operations chief
Internet Mall, a.s.
<http://www.MALL.cz>

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Brad Nicholson 2006-08-28 14:48:18 Re: Identifying bloated tables
Previous Message Christopher Browne 2006-08-28 14:28:42 Re: Postgre SQL 7.1 cygwin performance issue.