Vacuum full crash

From: "Mikko Partio" <mpartio(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Vacuum full crash
Date: 2008-03-29 20:40:54
Message-ID: 2ca799770803291340p75d8a4e2o7dfd6c61e73270f6@mail.gmail.com
Lists: pgsql-admin
Hello list,

An interrupted VACUUM FULL has just caused a PG instance to restart and
recover. Background:

select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.1 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
(1 row)

I have a largish (>1 TB) database which is a kind of data warehouse.
Recently I had to do some major operations on some of the tables (updating all
rows in a table, etc.), which caused major bloat. To remove the bloat, I ran
VACUUM FULL VERBOSE on the bloated tables. Before the vacuum finished, I
had to abort it due to problems with my own laptop. When I hit Ctrl+C to cancel
the vacuum, the PG instance suddenly went into recovery mode. The logs showed this:

2008-03-29 22:25:15 EET [26841]: [1-1] ERROR:  canceling statement due to user request
2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT:  vacuum full verbose xyz ;
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC:  cannot abort transaction 3778747509, it was already committed
2008-03-29 22:25:15 EET [6476]: [4-1] LOG:  server process (PID 26841) was terminated by signal 6: Aborted
2008-03-29 22:25:15 EET [6476]: [5-1] LOG:  terminating any other active server processes
2008-03-29 22:25:15 EET [24814]: [48-1] WARNING:  terminating connection because of crash of another server process
2008-03-29 22:25:15 EET [24814]: [49-1] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2008-03-29 22:25:15 EET [24814]: [50-1] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2008-03-29 22:25:15 EET [6476]: [6-1] LOG:  archiver process (PID 6489) exited with exit code 1
2008-03-29 22:25:15 EET [18228]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:16 EET [6476]: [7-1] LOG:  all server processes terminated; reinitializing
2008-03-29 22:25:16 EET [18229]: [1-1] LOG:  database system was interrupted; last known up at 2008-03-29 22:20:24 EET
2008-03-29 22:25:16 EET [18229]: [2-1] LOG:  database system was not properly shut down; automatic recovery in progress
2008-03-29 22:25:16 EET [18229]: [3-1] LOG:  redo starts at CB5/16399698
2008-03-29 22:25:22 EET [18229]: [4-1] LOG:  unexpected pageaddr CB4/76FF6000 in log file 3253, segment 35, offset 16736256
2008-03-29 22:25:22 EET [18229]: [5-1] LOG:  redo done at CB5/23FF4C80
2008-03-29 22:25:22 EET [18229]: [6-1] LOG:  last completed transaction was at log time 2008-03-29 22:22:47.931231+02
2008-03-29 22:25:23 EET [18336]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:27 EET [18337]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:30 EET [18346]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:41 EET [18424]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:43 EET [18427]: [1-1] LOG:  autovacuum launcher started
2008-03-29 22:25:43 EET [6476]: [8-1] LOG:  database system is ready to accept connections
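
As a reading aid for the log above, here is a minimal Python sketch that splits entries by PID and severity so the PANIC and the process that raised it are easy to pick out. It assumes the prefix layout visible in the excerpt (timestamp, time zone, [pid], [line-n], severity); the regular expression, the function names parse_log and entries_at_level, and the SAMPLE text are illustrative assumptions, not PostgreSQL tooling. Any line without the prefix is treated as a continuation of the previous entry, which covers messages wrapped by a mail client.

```python
import re

# Assumed layout, inferred from the log excerpt (not an official format spec):
# "<date> <time> <tz> [<pid>]: [<n>-<m>] <SEVERITY>:  <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<tz>\S+) "
    r"\[(?P<pid>\d+)\]: \[(?P<seq>\d+)-\d+\] (?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def parse_log(text):
    """Parse log text into a list of dicts, one per log entry.

    Lines that do not start with the assumed prefix are appended to the
    message of the preceding entry (wrapped or multi-line messages).
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LOG_LINE.match(line)
        if match:
            entry = match.groupdict()
            entry["pid"] = int(entry["pid"])
            entries.append(entry)
        elif entries:
            entries[-1]["msg"] += " " + line
    return entries


def entries_at_level(entries, level):
    """Return only the entries with the given severity, e.g. 'PANIC'."""
    return [e for e in entries if e["level"] == level]


# A few lines copied from the log in this message, still wrapped.
SAMPLE = """\
2008-03-29 22:25:15 EET [26841]: [1-1] ERROR:  canceling statement due to
user request
2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT:  vacuum full verbose xyz ;
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC:  cannot abort transaction
3778747509, it was already committed
2008-03-29 22:25:15 EET [6476]: [4-1] LOG:  server process (PID 26841) was
terminated by signal 6: Aborted
"""

if __name__ == "__main__":
    for e in entries_at_level(parse_log(SAMPLE), "PANIC"):
        print(e["pid"], e["msg"])
```

Read this way, PID 26841 is the backend that ran the vacuum and raised the PANIC, while PID 6476 is the postmaster, which reports the crash and reinitializes the server.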


This seems quite serious to me ("cannot abort a transaction that has already
committed"). What can cause such behaviour?

Regards

Mikko
