Measuring table and index bloat

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Measuring table and index bloat
Date: 2007-12-08 07:06:46
Message-ID: Pine.GSO.4.64.0712080127420.5116@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

One of those things that comes up regularly on this list in particular are
people whose performance issues relate to "bloated" tables or indexes.
What I've always found curious is that I've never seen a good way
suggested to actually measure said bloat in any useful numeric
terms--until today.

Greg Sabino Mullane just released a Nagios plug-in for PostgreSQL that you
can grab at http://bucardo.org/nagios_postgres/ , and while that is itself
nice the thing I found most remarkable is the bloat check. The majority
of that code is an impressive bit of SQL that anyone could use even if you
have no interest in Nagios, which is why I point it out for broader
attention. Look in check_postgres.pl for the "check_bloat" routine and
the big statement starting at the aptly labled "This was fun to write"
section. If you pull that out of there and replace $MINPAGES and
$MINIPAGES near the end with real values, you can pop that into a
standalone query and execute it directly. Results look something like
this (reformatting for e-mail):

schemaname | tablename | reltuples | relpages | otta | tbloat |
public | accounts | 2500000 | 41667 | 40382 | 1.0 |

wastedpages | wastedbytes | wastedsize | iname | ituples |
1285 | 10526720 | 10 MB | accounts_pkey | 2500000 |

ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize
5594 | 35488 | 0.2 | 0 | 0 | 0 bytes

I'd be curious to hear from those of you who have struggled with this
class of problem in the past as to whether you feel this quantifies the
issue usefully.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alvaro Herrera 2007-12-08 13:21:23 Re: Cost-Based Vacuum Delay tuning
Previous Message Ron Mayer 2007-12-07 20:46:21 Re: TB-sized databases