
Re: Very long deletion time on a 200 GB database

From: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very long deletion time on a 200 GB database
Date: 2012-02-27 08:08:15
Message-ID: 4F4B39EF.2080004@lerner.co.il
Lists: pgsql-performance
Hi, everyone.  I wanted to thank you again for your help on the huge 
delete problem that I was experiencing.

After a lot of trial and error, we finally concluded that deleting this 
much data in the time frame they need, on underpowered hardware that is 
shared with the application, and with each test iteration taking 5-9 
hours to run when it needs to finish in 2-3, is just not going to 
happen.  We tried many of the options that people helpfully suggested 
here, but none of them gave us the performance that we needed.

(One of the developers kept asking me how it can possibly take so long 
to delete 200 GB, when he can delete files of that size in much less 
time.  I had to explain to him that deleting rows from a database is a 
far more complicated task, and can't really be compared to deleting a 
few files.)

In the end, it was agreed that we could execute the deletes over time, 
deleting items in the background, or in parallel with the application's 
work.  After all, if the disk is filling up at the rate of 2 GB/day, 
then so long as we delete 4 GB/day (which is pretty easy to do), we 
should be fine.  Adding RAM or another disk is simply out of the 
question, which is really a shame for a database of this size.
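
In case it helps anyone in a similar spot, here is a minimal sketch of 
the sort of batched, background delete we have in mind.  The table name, 
column name, and 30-day cutoff are just placeholders for illustration, 
not our real schema:

    -- Delete up to batch_size old rows and report how many were removed.
    -- big_table, created_at, and the cutoff are made-up names.
    CREATE OR REPLACE FUNCTION purge_old_rows(batch_size integer)
    RETURNS integer AS $$
    DECLARE
        deleted integer;
    BEGIN
        DELETE FROM big_table
        WHERE ctid IN (SELECT ctid
                         FROM big_table
                        WHERE created_at < now() - interval '30 days'
                        LIMIT batch_size);
        GET DIAGNOSTICS deleted = ROW_COUNT;
        RETURN deleted;
    END;
    $$ LANGUAGE plpgsql;

Calling that repeatedly from cron or a small loop, with a commit between 
batches, keeps each transaction short, so the deletes can trickle along 
next to the application instead of fighting it for hours.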

I should add that it was interesting/amusing to see the difference 
between the Unix and Windows philosophies.  Each time I would update my 
pl/pgsql functions, the Windows guys would wrap them into a string 
inside of a .NET program, which then needed to be compiled, installed, 
and run, adding enormous overhead to our already long testing procedure. 
I finally managed to show them that we could get equivalent 
functionality, with way less overhead, by just running psql -f FILENAME. 
This version doesn't have fancy GUI output, but it works just fine...
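
For what it's worth, the whole cycle now looks roughly like this (the 
file and database names here are made up):

    # load (or reload) the function definitions from a plain SQL file
    psql -d ourdb -f purge_functions.sql

    # then run one batch, e.g. from cron every few minutes
    psql -d ourdb -c "SELECT purge_old_rows(10000);"

No compiling, no installing, just editing a text file and re-running 
psql.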

I always tell people that PostgreSQL is not just a great database, but a 
fantastic, helpful community.  Thanks to everyone for their suggestions 
and advice.

Reuven
