Vacuum full crash

From: "Mikko Partio" <mpartio(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Vacuum full crash
Date: 2008-03-29 20:40:54
Message-ID: 2ca799770803291340p75d8a4e2o7dfd6c61e73270f6@mail.gmail.com
Lists: pgsql-admin

Hello list,

An interrupted VACUUM FULL has just caused a PG instance to restart and
recover. Background:

select version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 8.3.1 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20070626 (Red Hat 4.1.2-14)
(1 row)

I have a largish (>1 TB) database which is a kind of data warehouse.
Recently I had to do some major operations on some of the tables (updating all
rows in a table, etc.), which caused major bloat. To remove the bloat, I ran
VACUUM FULL VERBOSE on the bloated tables. Before the vacuum finished, I
had to abort it due to problems with my own laptop. When I pressed Ctrl+C to
cancel the vacuum, the PG instance suddenly went into recovery mode. The logs
showed this:

2008-03-29 22:25:15 EET [26841]: [1-1] ERROR: canceling statement due to
user request
2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT: vacuum full verbose xyz ;
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: cannot abort transaction
3778747509, it was already committed
2008-03-29 22:25:15 EET [6476]: [4-1] LOG: server process (PID 26841) was
terminated by signal 6: Aborted
2008-03-29 22:25:15 EET [6476]: [5-1] LOG: terminating any other active
server processes
2008-03-29 22:25:15 EET [24814]: [48-1] WARNING: terminating connection
because of crash of another server process
2008-03-29 22:25:15 EET [24814]: [49-1] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and exit,
because another server process exited abnormally and possibly corrupted
shared memory.
2008-03-29 22:25:15 EET [24814]: [50-1] HINT: In a moment you should be
able to reconnect to the database and repeat your command.
2008-03-29 22:25:15 EET [6476]: [6-1] LOG: archiver process (PID 6489)
exited with exit code 1
2008-03-29 22:25:15 EET [18228]: [1-1] FATAL: the database system is in
recovery mode
2008-03-29 22:25:16 EET [6476]: [7-1] LOG: all server processes terminated;
reinitializing
2008-03-29 22:25:16 EET [18229]: [1-1] LOG: database system was
interrupted; last known up at 2008-03-29 22:20:24 EET
2008-03-29 22:25:16 EET [18229]: [2-1] LOG: database system was not
properly shut down; automatic recovery in progress
2008-03-29 22:25:16 EET [18229]: [3-1] LOG: redo starts at CB5/16399698
2008-03-29 22:25:22 EET [18229]: [4-1] LOG: unexpected pageaddr
CB4/76FF6000 in log file 3253, segment 35, offset 16736256
2008-03-29 22:25:22 EET [18229]: [5-1] LOG: redo done at CB5/23FF4C80
2008-03-29 22:25:22 EET [18229]: [6-1] LOG: last completed transaction was
at log time 2008-03-29 22:22:47.931231+02
2008-03-29 22:25:23 EET [18336]: [1-1] FATAL: the database system is in
recovery mode
2008-03-29 22:25:27 EET [18337]: [1-1] FATAL: the database system is in
recovery mode
2008-03-29 22:25:30 EET [18346]: [1-1] FATAL: the database system is in
recovery mode
2008-03-29 22:25:41 EET [18424]: [1-1] FATAL: the database system is in
recovery mode
2008-03-29 22:25:43 EET [18427]: [1-1] LOG: autovacuum launcher started
2008-03-29 22:25:43 EET [6476]: [8-1] LOG: database system is ready to
accept connections
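
For anyone triaging a log like the one above, the crash entry can be picked out mechanically. What follows is only a minimal sketch and not part of the original report: it assumes the line prefix format seen in this log (timestamp, time zone, [pid], [line-sub], severity), and it assumes the mail-wrapped lines have been re-joined first. The function name find_panics and the SAMPLE excerpt are illustrative choices.

```python
import re

# Matches the prefix format seen in the log above, e.g.
# "2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<tz>\S+) "
    r"\[(?P<pid>\d+)\]: \[(?P<seq>\d+-\d+)\] "
    r"(?P<level>[A-Z]+): +(?P<msg>.*)$"
)


def find_panics(log_text):
    """Return (pid, message) for each distinct PANIC entry, in log order."""
    seen = set()
    panics = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None or m.group("level") != "PANIC":
            continue
        key = (m.group("pid"), m.group("seq"), m.group("msg"))
        if key in seen:  # skip entries that were pasted or logged twice
            continue
        seen.add(key)
        panics.append((int(m.group("pid")), m.group("msg")))
    return panics


# Excerpt taken from the log above, with wrapped lines re-joined.
SAMPLE = """\
2008-03-29 22:25:15 EET [26841]: [1-1] ERROR: canceling statement due to user request
2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT: vacuum full verbose xyz ;
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: cannot abort transaction 3778747509, it was already committed
2008-03-29 22:25:15 EET [6476]: [4-1] LOG: server process (PID 26841) was terminated by signal 6: Aborted
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: cannot abort transaction 3778747509, it was already committed
"""

if __name__ == "__main__":
    for pid, msg in find_panics(SAMPLE):
        print(pid, msg)
```

The (pid, line-sub, message) key is used for de-duplication because the same entry can show up more than once in a pasted log; a different prefix format would need a different regular expression.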

This seems quite serious to me ("cannot abort a transaction that has already
committed"). What can cause such behaviour?

Regards

Mikko
