Re: Is there such a thing as a 'background database job'?

From: Vivek Khera <vivek(at)khera(dot)org>
To: pgsql general list <pgsql-general(at)postgresql(dot)org>
Subject: Re: Is there such a thing as a 'background database job'?
Date: 2005-08-25 18:25:27
Message-ID: 70666D1F-AB2E-49CB-B61B-815B8EB01B7C@khera.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Aug 22, 2005, at 10:53 PM, Mike Nolan wrote:

> In a recent discussion with an academician friend of mine regarding
> how
> to improve performance on a system, he came up with the idea of
> taking what
> is now a monthly purge/cleanup job that takes about 24 hours (and
> growing)
> and splitting it up into a series of smaller tasks.
>

Well, the purge/cleanup requires a finite amount of work X. Either
you make that a more efficient process requiring Y < X amount of
time, or you split up the process so that X/N time is used for each
of the N runs of the cleanup, or you do both...

Just splitting the job doesn't mean it will take less time overall,
since it still must do the same total amount of work.

I'm guessing that when you're cleanup is running, it impacts the
performance of the rest of the system, and that is the "faster" you
want to achieve.

My recommendation of what you want to do (and this is what I do) for
your cleanup process is to throttle it and let it run over many days
but pause between smaller tasks, or change your procedure to let you
run your cleanup once per week or per day incrementally.

In my case, I need to purge about 6000 rows from one table, but the
cascading deletes end up removing anywhere from 500 to 200k rows in
other tables per row deleted in the main table. Since I know how
many referenced rows will be removed, I keep a tally and when the
total reaches > 150k rows, I pause for a while. If an individual
rows results in > 10k related rows being deleted, I pause for a
smaller amount of time.

This keeps everything moving along, and *nobody* notices. So what if
it takes 3 days to finish...

Vivek Khera, Ph.D.
+1-301-869-4449 x806

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Vivek Khera 2005-08-25 18:52:20 Re: ctid access is slow
Previous Message Tom Lane 2005-08-25 16:50:11 Re: Help with a subselect inside a view