Vacuum looping?

From: "Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Vacuum looping?
Date: 2007-07-27 21:32:11
Message-ID: 357fa7590707271432l709da359n68deea693d2c5fdf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Postgres 8.2.4.

We have a large table, let's call it "foo", whereby an automated process
periodically inserts many (hundreds of thousands or millions) rows into it
at a time. It's essentially INSERT INTO foo SELECT FROM <another table>
WHERE <some conditions>. Recently, for whatever reason, the query started
to run out of memory. This happened on the order of 50 times before it was
noticed and the process was stopped. (Admittedly, more investigation needs
to go into the OOM problem... )

Now autovacuum promptly kicked in trying to clean up this mess, however it
couldn't keep up at the rate that dead tuples were being generated. I'm not
sure if it got into a weird state. After a few days, long after the
inserting process was stopped, we decided to abort the vacuum (which we
weren't convinced was doing anything), then start a manual vacuum with a
higher vacuum_cost_limit to get things cleaned up quicker.

After 28 hours, here was the output of vacuum verbose:

# VACUUM VERBOSE foo;
INFO: vacuuming "public.foo"
INFO: scanned index "foo_pkey" to remove 44739062 row versions
DETAIL: CPU 5.74s/26.09u sec elapsed 529.57 sec.
INFO: scanned index "foo_1" to remove 44739062 row versions
DETAIL: CPU 760.09s/619.83u sec elapsed 56929.54 sec.
INFO: scanned index "foo_2" to remove 44739062 row versions
DETAIL: CPU 49.35s/99.57u sec elapsed 4410.74 sec.
INFO: "foo": removed 44739062 row versions in 508399 pages
DETAIL: CPU 47.35s/12.88u sec elapsed 3985.92 sec.
INFO: scanned index "foo_pkey" to remove 32534234 row versions
DETAIL: CPU 22.05s/32.51u sec elapsed 2259.05 sec.

The vacuum then just sat there. What I can't understand is why it went back
for a second pass of the pkey index? There was nothing writing to the table
once the vacuum began. Is this behaviour expected? Are these times
reasonable for a vacuum (on a busy system, mind you)?

We have since aborted the vacuum and truncated the table. We're now working
on the root OOM problem, which is easier said than done...

Steve

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tilmann Singer 2007-07-28 12:52:36 Re: Slow query with backwards index scan
Previous Message Jignesh K. Shah 2007-07-27 20:04:57 Re: User concurrency thresholding: where do I look?