Re: Very long deletion time on a 200 GB database

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Very long deletion time on a 200 GB database
Date: 2012-02-23 17:56:14
Message-ID: CAEtnbpUZkdg-gH_AcbTMG41yN+Y0ECXdtnGTdf232B6T3v_1SA@mail.gmail.com
Lists: pgsql-performance

On Thu, Feb 23, 2012 at 8:25 AM, Reuven M. Lerner <reuven(at)lerner(dot)co(dot)il> wrote:

>
> I've suggested something similar, but was told that we have limited time
> to execute the DELETE, and that doing it in stages might not be possible.
>
>
Just so happens I had this exact problem last week on a rather large table.
* DELETE FROM x WHERE id IN (SELECT id FROM y ...) was horrible at best.
* DELETE FROM x USING y WHERE ... was nearly as bad
Both of the above were taking hours and looked like they would stretch
into days.
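
For reference, the staged approach usually suggested on this list (the one
Reuven says he may not have time for) looks roughly like this -- just a
sketch, assuming x.id is indexed and looping from the client until the
DELETE reports 0 rows:

-- Hypothetical batched delete; repeat until no rows are affected.
DELETE FROM x
 WHERE id IN (SELECT x.id FROM x JOIN y ON x.id = y.id LIMIT 10000);

Each pass holds locks only briefly, at the cost of rescanning the join.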

What I discovered, however, was this little technique. I'm sure all the
Pg gods will smite me for using it, but it worked for me.

BEGIN;
LOCK TABLE x IN SHARE UPDATE EXCLUSIVE MODE; -- to prevent VACUUMs
SELECT x.ctid INTO TEMPORARY TABLE recs_to_delete FROM x,y WHERE x.id=y.id;
DELETE FROM x USING recs_to_delete r WHERE r.ctid=x.ctid;
COMMIT;

I know there are perils in using ctid but with the LOCK it should be safe.
This transaction took perhaps 30 minutes and removed 100k rows and once
the table was VACUUM'd afterward it freed up close to 20 GB on the file
system.
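
For anyone trying to reproduce the space savings, a quick way to measure
what the VACUUM actually reclaimed -- a generic sketch, not from my run,
using the standard size functions:

SELECT pg_size_pretty(pg_total_relation_size('x'));  -- before
VACUUM x;
SELECT pg_size_pretty(pg_total_relation_size('x'));  -- after

Note that a plain VACUUM only returns space to the OS when the dead tuples
sit at the end of the table; otherwise the pages are just made reusable.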

HTH
-Greg
