Re: Very long deletion time on a 200 GB database

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: sthomas(at)peak6(dot)com
Cc: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Very long deletion time on a 200 GB database
Date: 2012-02-27 14:53:59
Message-ID: 4F4B9907.60008@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 02/27/2012 09:45 AM, Shaun Thomas wrote:
> On 02/27/2012 02:08 AM, Reuven M. Lerner wrote:
>
>> In the end, it was agreed that we could execute the deletes over
>> time, deleting items in the background, or in parallel with the
>> application's work. After all, if the disk is filling up at the rate
>> of 2 GB/day, then so long as we delete 4 GB/day (which is pretty easy
>> to do), we should be fine.
>
> Please tell me you understand deleting rows from a PostgreSQL database
> doesn't work like this. :) The MVCC storage system means you'll
> basically just be marking all those deleted rows as reusable, so your
> database will stop growing, but you'll eventually want to purge all
> the accumulated dead rows.
>
> One way to see how many there are is to use the pgstattuple contrib
> module. You can just call it on the table name in question:
>
> SELECT * FROM pgstattuple('my_table');
>
> You may find that after your deletes are done, you'll have a free_pct
> of 80+%. In order to get rid of all that, you'll need to either run
> CLUSTER on your table(s) or use the select->truncate->insert method
> anyway.
>

If he has autovacuum on he could well be just fine with his proposed
strategy. Or he could have tables partitioned by time and do the delete
by just dropping partitions. There are numerous way he could get this to
work.

cheers

andrew

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Reuven M. Lerner 2012-02-27 14:59:57 Re: Very long deletion time on a 200 GB database
Previous Message Merlin Moncure 2012-02-27 14:53:15 Re: Joining tables by UUID field - very slow