Re: Disk space consumed by pk not returned after vacuum or reindex

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Kevin Johnson <Kevin(dot)Johnson(at)noaa(dot)gov>
Cc: pgsql-admin(at)postgresql(dot)org, Sean Webb <Sean(dot)Webb(at)noaa(dot)gov>
Subject: Re: Disk space consumed by pk not returned after vacuum or reindex
Date: 2006-09-22 17:05:57
Message-ID: 20060922170557.GB571@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Wed, Sep 13, 2006 at 09:53:16 -0400,
Kevin Johnson <Kevin(dot)Johnson(at)noaa(dot)gov> wrote:
>
> We have a database, which consistently consumes more and more of the
> disk space in it's lvol until it reaches 100%. So far, we have tried to
> run a full vacuum on the database, with limited success. Eventually, we
> had to drop and reload the database with the same data inside. It
> brought the disk usage down to 73%. It then began to creep once more
> toward 100%. After some research, I was able to use the pg_class
> catalog to find that the items which are expanding the quickest are
> primary key (btree) indexes. I attempted to run a REINDEX on one of the
> tables with the pk taking up the largest amount of space. The usage
> according to pg_class dropped dramatically, however the disk space was
> not returned to the system. So I attempted another full vacuum
> afterwards, and still nothing was returned to the system. These tables
> are updated extremely frequently (although their total number of rows is
> close to constant), which is my guess as to why the pk indexes increase
> so rapidly in terms of their disk usage. Unfortunately, PostgreSQL
> knowledge is limited, and I was wondering if anyone had experienced
> something similar / knows what else we can do to return this disk space
> back to the system?

This is possibly index bloat due to new keys always being larger than existing
keys. This was fixed in later releases. There is still some potential for
bloat due to fragmentation, but I believe that has a constant bound.
You might try using the cluster command. I think that will both clean up
the indexes and remove dead rows and do it faster than using a vacuum full
and reindexing. The downside is that the table will be unavailable during
the cluster which might be a deal breaker for you.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Lane Van Ingen 2006-09-22 18:09:23 Security Concerns over User 'postgres'
Previous Message Jim C. Nasby 2006-09-22 16:34:46 Re: how to check the creation date of a postgres database?