Re: Recovering real disk space

From: "Guy Rouillier" <guyr(at)masergy(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Recovering real disk space
Date: 2005-04-03 06:59:43
Message-ID: CC1CF380F4D70844B01D45982E671B2348E7AA@mtxexch01.add0.masergy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Adam Siegel wrote:
> We have a system that archives data to a postgres database. The raw
> data is confined to one table. Each record in the table is generally
> 1500 bytes. Each record is also associated with a volume name.
> During normal operations, many millions of rows are written to this
> table. After sometime the disk gets close to being full.
>
> We have a program that allows the user to offload rows from a table
> based upon volume name to a CD. This is done by doing a select on the
> table and then creating a memento record for each row and persisting
> it to the CD. Once the rows are written to the CD a delete is
> performed on the table for the off-loaded rows.
>
> select * from packets where volume = 'abc';
>
> ... process to write to CD ...
>
> delete from packets where volume = 'abc';
>
> We perform a vacuum full after each mass delete. This cycle can
> happen many times during over a couple of weeks. We are in a test lab
> environment and are generating a lot of data.
>
> One of the problems we have is that the vacuum on the table can take
> up to 10 hours. We also expect to see the physical disk space go
> down, but this does not happen. If we accidently fill up the disk,
> then all bets are off and we are unable to recover. A vacuum never
> seems to finish (several days).
>
> How can we physically recover "real" disk space for the rows that were
> deleted from the table? I've heard about free space buffers, but am
> not really sure how they work. Are there configuration items that
> can be tweaked to help with vacuuming large tables?

You mention millions of rows. You realize that attempting to delete all
those rows will use up massive amounts of transaction log space, right?
We have a data collection system that stores about 2 million rows a day.
To avoid this issue, we have separate tables for each month. We want to
keep N months available online. So when a new month starts, we save off
table N-1 to backup and then just drop the table. You may want to
consider something like this; if you don't have too many volumes, have
separate tables by volume by month.

>
> Thanks!!!!
>
>
>
> ---------------------------(end of
> broadcast)--------------------------- TIP 2: you can get off all
> lists at once with the unregister command (send "unregister
> YourEmailAddressHere" to majordomo(at)postgresql(dot)org)

--
Guy Rouillier

Browse pgsql-general by date

  From Date Subject
Next Message Guy Rouillier 2005-04-03 07:02:08 Re: Debugging deadlocks
Previous Message Patrick TJ McPhee 2005-04-03 04:23:31 Re: Recovering real disk space