Re: Very long deletion time on a 200 GB database

From: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>
To: Glyn Astill <glynastill(at)yahoo(dot)co(dot)uk>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very long deletion time on a 200 GB database
Date: 2012-02-24 06:39:48
Message-ID: 4F4730B4.2050602@lerner.co.il
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi, everyone.

So it turns out that we're not using 25 GB of virtual memory. (That's
what I had been shown yesterday, and it was a bit surprising, to say the
least...)

A few statistics that I managed to get from the Windows
developers/system administrators:

- The machine has a total of 3.5 GB of RAM
- shared_buffers was set to 256 MB (yes, MB!)
- Virtual memory usage by our process is 3 MB (yes, MB)
- CPU is virtually idle when running the deletes, using about 1% of CPU
- No other processes are accessing the database when we're running the
maintenance; there are a total of three server processes, but two are idle.

(I was a bit surprised, to say the least, by the low number on
shared_buffers, given that I believe it's one of the first things I told
them to increase about 18 months ago.)

As for Tom's point about rules, I know that rules are bad, and I'm not
sure why the system is using a rule rather than a trigger. I'll see
if I can change that to a trigger, but I have very indirect control over
the machines, and every change requires (believe it or not) writing a
.NET program that runs my changes, rather than just a textual script
that deploys them.

The only foreign keys are from the B table (i.e., the table whose
records I want to remove) to other tables. There are no REFERENCES
pointing to the B table. That said, I hadn't realized that primary keys
and indexes can also delay the DELETE.

For the latest round of testing, I quadrupled shared_buffers to 1 GB,
turned off hash joins (as suggested by someone), and also simplified the
query (based on everyone's suggestions). In the tests on my own
computer (with a somewhat random 1 GB snapshot of the 200 GB database),
the simplified query was indeed much faster, so I'm optimistic.

Several people suggested that chunking the deletes might indeed help,
which makes me feel a bit better. Unfortunately, given the time that it
takes to run the queries, it's hard to figure out the right chunk size.
Whoever suggested doing it in time slices had an interesting idea, but
I'm not sure if it'll be implementable given our constraints.

Thanks again to everyone for your help. I'll let you know what happens...

Reuven

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Samuel Gendler 2012-02-24 08:22:21 Re: Very long deletion time on a 200 GB database
Previous Message Merlin Moncure 2012-02-23 22:58:01 Re: Re: [PERFORM] Disable-spinlocks while compiling postgres 9.1 for ARM Cortex A8