Help: 8.0.3 Vacuum of an empty table never completes ...

From: James Robinson <jlrobins(at)socialserve(dot)com>
To: Hackers Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Help: 8.0.3 Vacuum of an empty table never completes ...
Date: 2005-11-28 16:04:39
Message-ID: A408C5A2-6E05-439B-AAD5-71F93ADFDD54@socialserve.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

G'day folks.

We have a production database running 8.0.3 which gets fully
pg_dump'd and vacuum analyze'd hourly by cron. Something strange
happened to us on the 5AM Friday Nov. 25'th cron run -- the:

/usr/local/pgsql/bin/vacuumdb -U postgres --all --analyze --verbose
>& $DATE/vacuum.log

step in our cron procedure never completed. Strange, since no known
event of note happened on Friday since we were all out of the office
past Wed. for the american Thanksgiving holiday. Anyway, running the
vacuum line by hand shows it getting stuck -- processes the majority
of our tables, then just stops, and the backend postmaster just stops
accumulating CPU time.

Comparing the logs further with when it did complete, it seems that
one table in particular (at least) seems afflicted:

social=# vacuum verbose analyze agency.swlog_client;

hangs up forever -- have to control-c the client. Likewise for w/o
'analyze'.

pg-dump'ing the entire database works (phew!) and upon restoring on a
backup box, said table can be vacuum'd:

social=# vacuum verbose analyze agency.swlog_client;
INFO: vacuuming "agency.swlog_client"
INFO: index "swlog_client_pkey" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "swlog_client": found 0 removable, 0 nonremovable row versions
in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "agency.swlog_client"
INFO: "swlog_client": scanned 0 of 0 pages, containing 0 live rows
and 0 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM

That's right -- completely empty table -- which is what we actually
expect.

How should we proceed such that we can learn from this as well as we
can proceed and get our entire database vacuuming again successfully?

Running on Linux 2.6.8-24.18-smp (SuSE 9.2). No juicy filesystem-
related messages in dmesg nor /var/log/messages. 11% disk used on the
postgres-related partition.

The table in question is defined as:

social=# \d agency.swlog_client;
Table "agency.swlog_client"
Column | Type | Modifiers
--------+--------+-----------
swlog | bigint | not null
client | bigint | not null
Indexes:
"swlog_client_pkey" PRIMARY KEY, btree (swlog, client)
Foreign-key constraints:
"$2" FOREIGN KEY (client) REFERENCES agency.client(id)
"$1" FOREIGN KEY (swlog) REFERENCES agency.swlog(id)

And the two fk'd tables:

social=# select count(*) from agency.client;
count
-------
0
(1 row)

social=# select count(*) from agency.swlog;
count
-------
69

We doubt that there could be any strange oddball extremely
longrunning transaction in any of those related tables gumming up
this table.

Finally, the only possibly potentially interesting event database-
wise happened on Wed. Nov. 23'rd -- we SIGHUP'd the postmaster to
have it learn a higher value for work_mem (10240, up from default of
1024). But the hourly crons went great for the subsequent two days.
maintenance_work_mem is still at the default of 16384.

Many thanks in advance!
James

----
James Robinson
Socialserve.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-11-28 16:09:17 Re: Help: 8.0.3 Vacuum of an empty table never completes ...
Previous Message Tom Lane 2005-11-28 15:11:31 Re: Getting different number of results when using hashjoin on/off