Skip site navigation (1) Skip section navigation (2)

Re: Very long deletion time on a 200 GB database

From: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>
To: Glyn Astill <glynastill(at)yahoo(dot)co(dot)uk>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very long deletion time on a 200 GB database
Date: 2012-02-24 06:39:48
Message-ID: 4F4730B4.2050602@lerner.co.il (view raw or flat)
Thread:
Lists: pgsql-performance
Hi, everyone.

So it turns out that we're not using 25 GB of virtual memory.  (That's 
what I had been shown yesterday, and it was a bit surprising, to say the 
least...)

A few statistics that I managed to get from the Windows 
developers/system administrators:

- The machine has a total of 3.5 GB of RAM
- shared_buffers was set to 256 MB (yes, MB!)
- Virtual memory usage by our process is 3 MB (yes, MB)
- CPU is virtually idle when running the deletes, using about 1% of CPU
- No other processes are accessing the database when we're running the 
maintenance; there are a total of three server processes, but two are idle.

(I was a bit surprised, to say the least, by the low number on 
shared_buffers, given that I believe it's one of the first things I told 
them to increase about 18 months ago.)

As for Tom's point about rules, I know that rules are bad, and I'm not 
sure why the system is using a rule rather than a trigger.    I'll see 
if I can change that to a trigger, but I have very indirect control over 
the machines, and every change requires (believe it or not) writing a 
.NET program that runs my changes, rather than just a textual script 
that deploys them.

The only foreign keys are from the B table (i.e., the table whose 
records I want to remove) to other tables.  There are no REFERENCES 
pointing to the B table.  That said, I hadn't realized that primary keys 
and indexes can also delay the DELETE.

For the latest round of testing, I quadrupled shared_buffers to 1 GB, 
turned off hash joins (as suggested by someone), and also simplified the 
query (based on everyone's suggestions).  In the tests on my own 
computer (with a somewhat random 1 GB snapshot of the 200 GB database), 
the simplified query was indeed much faster, so I'm optimistic.

Several people suggested that chunking the deletes might indeed help, 
which makes me feel a bit better.  Unfortunately, given the time that it 
takes to run the queries, it's hard to figure out the right chunk size. 
  Whoever suggested doing it in time slices had an interesting idea, but 
I'm not sure if it'll be implementable given our constraints.

Thanks again to everyone for your help.  I'll let you know what happens...

Reuven

In response to

Responses

pgsql-performance by date

Next:From: Samuel GendlerDate: 2012-02-24 08:22:21
Subject: Re: Very long deletion time on a 200 GB database
Previous:From: Merlin MoncureDate: 2012-02-23 22:58:01
Subject: Re: Re: [PERFORM] Disable-spinlocks while compiling postgres 9.1 for ARM Cortex A8

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group