
Re: Very long deletion time on a 200 GB database

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: David Kerr <dmk(at)mr-paradox(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very long deletion time on a 200 GB database
Date: 2012-02-29 05:31:29
Message-ID:
Lists: pgsql-performance
On 29/02/12 06:06, David Kerr wrote:
> On 02/27/2012 12:08 AM, Reuven M. Lerner wrote:
>> Hi, everyone. I wanted to thank you again for your help on the huge
>> delete problem that I was experiencing.
>> After a lot of trial and error, we finally came to the conclusion that
>> deleting this much data in the time frame that they need, on
>> underpowered hardware that is shared with an application, with each test
>> iteration taking 5-9 hours to run (but needing to run in 2-3), is just
>> not going to happen. We tried many of the options that people helpfully
>> suggested here, but none of them gave us the performance that we needed.
>> (One of the developers kept asking me how it can possibly take so long
>> to delete 200 GB, when he can delete files of that size in much less
>> time. I had to explain to him that deleting rows from a database is a
>> far more complicated task, and can't really be compared to deleting a
>> few files.)
>> In the end, it was agreed that we could execute the deletes over time,
>> deleting items in the background, or in parallel with the application's
>> work. After all, if the disk is filling up at the rate of 2 GB/day, then
>> so long as we delete 4 GB/day (which is pretty easy to do), we should be
>> fine. Adding RAM or another disk are simply out of the question, which
>> is really a shame for a database of this size.
> Howdy,
> I'm coming a little late to the thread, but I didn't see anyone propose 
> some tricks I've used in the past to overcome the slow-delete problem.
> First - if you can drop your FKs, delete, then re-create your FKs, you'll 
> find that you can delete an amazing amount of data very quickly.
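A minimal sketch of that first trick, with hypothetical table, column, and
constraint names (the real definitions would come from your schema):

    -- drop the FK that points at the table being purged
    ALTER TABLE child_table DROP CONSTRAINT child_table_parent_id_fkey;

    -- the bulk delete no longer pays a per-row FK check
    DELETE FROM parent_table
    WHERE  created_at < now() - interval '90 days';

    -- re-create the FK; PostgreSQL validates it in a single pass
    ALTER TABLE child_table
        ADD CONSTRAINT child_table_parent_id_fkey
        FOREIGN KEY (parent_id) REFERENCES parent_table (id);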
> Second - if you can't do that - you can try a function that loops and 
> deletes a small amount at a time; this gets around the problem of 
> deleting more data than you can fit into memory. It's still slow, but 
> just not as slow.
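A sketch of such a batched-delete function in PL/pgSQL, again with
hypothetical names. PostgreSQL has no DELETE ... LIMIT, so the ctid
subquery stands in for one:

    CREATE OR REPLACE FUNCTION delete_in_batches(batch_size integer)
    RETURNS bigint AS $$
    DECLARE
        n_deleted integer;
        total     bigint := 0;
    BEGIN
        LOOP
            -- delete one small batch per iteration
            DELETE FROM parent_table
            WHERE ctid IN (SELECT ctid
                           FROM   parent_table
                           WHERE  created_at < now() - interval '90 days'
                           LIMIT  batch_size);
            GET DIAGNOSTICS n_deleted = ROW_COUNT;
            total := total + n_deleted;
            EXIT WHEN n_deleted = 0;
        END LOOP;
        RETURN total;
    END;
    $$ LANGUAGE plpgsql;

    SELECT delete_in_batches(10000);

Note that one function call is still one transaction; to let autovacuum
reclaim space between batches, you could instead run a single-batch
variant repeatedly from the client.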
> Third - don't delete; instead:
> create new_table as select * from old_table where <records are not the 
> ones you want to delete>;
> drop old_table;
> rename new_table to old_table;
> create indexes and constraints.
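Spelled out in SQL, with a hypothetical retention rule standing in for the
real filter. Note that old_table must be dropped (or renamed aside) before
new_table can take its name, and any FKs referencing it dropped first:

    BEGIN;
    -- write only the rows worth keeping into a compact new heap
    CREATE TABLE new_table AS
        SELECT * FROM old_table
        WHERE  created_at >= now() - interval '90 days';

    DROP TABLE old_table;
    ALTER TABLE new_table RENAME TO old_table;

    -- CREATE TABLE AS copies no indexes or constraints, so rebuild them
    ALTER TABLE old_table ADD PRIMARY KEY (id);
    CREATE INDEX old_table_created_at_idx ON old_table (created_at);
    COMMIT;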
> Fourth - I think some folks mentioned this, but just for completeness: 
> partition the table, and make sure that your partition key is such that 
> you can just drop an entire partition.
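At the time of this thread, PostgreSQL partitioning was built on table
inheritance, so a sketch in that style (hypothetical names; the trigger
that routes inserts to the right child is omitted):

    -- one child per month, bounded by a CHECK constraint
    CREATE TABLE log_2012_01 (
        CHECK (created_at >= DATE '2012-01-01'
           AND created_at <  DATE '2012-02-01')
    ) INHERITS (log_parent);

    -- retiring a month is then a catalog operation, not a row-by-row delete
    DROP TABLE log_2012_01;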
> Hope that helps and wasn't redundant.
> Dave

I think your first and third points are very obvious - but only after I 
had read them!  :-)

Your fourth point is not bad either!

Brilliant simplicity. I hope I can remember them if I run into a similar 
situation!