Recovery from multi trouble

From: OKADA Satoshi <okada(dot)satoshi(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Recovery from multi trouble
Date: 2005-12-19 05:20:35
Message-ID: 43A64323.2010502@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

I am testing the durability of PostgreSQL. When a part of log
was lost due to the trouble(power supply trouble etc.),
I tested about the recovery of PostgreSQL.

The loss of log was simulated by deleting the latest xlog file.

postmaster startup, it only output log message "No such file or
directory" while recovering database.
Here, the client can be connected with the database.I tried to
access table by "psql", but could not access it or all query was lost.

When xlog was lost or destroyed, I think that postmaster
should output PANIC message, and interrupt startup process.
(At least, I think that log level should be "Error".)

How do you think?

My operations are as follows.

--------
PostgreSQL Version: 8.1.1
OS: Fedora Core 4

* create database and table
> createdb testdb
CREATE DATABASE
> pgbench -d testdb -i

* database access by using pgbench
> pgbench testdb -t 10000

* kill postgresql processes before end of pgbench execution (simulate trouble)
> ps -ef | grep post
postgres 18901 1 0 03:10 pts/2 00:00:00 /home/postgres/Work/pg811/bin/postmaster
postgres 18903 18901 0 03:10 pts/2 00:00:00 postgres: writer process
postgres 18904 18901 0 03:10 pts/2 00:00:00 postgres: stats buffer process
postgres 18905 18904 0 03:10 pts/2 00:00:00 postgres: stats collector process
postgres 18947 18901 59 03:11 pts/2 00:00:01 postgres: postgres testdb [local] UPDATE
postgres 18949 18918 0 03:11 pts/3 00:00:00 grep post
> kill -9 18901; kill -9 18947

* latest xlog remove (sumilate xlog trouble)
> cd $PGDATA/pg_xlog; ls
000000010000000000000000 000000010000000000000001 archive_status/
> mv 000000010000000000000001 /tmp/.

* postmaster start(recovery start)
> pg_ctl -l pg811_1.log start

* check tables of pgbench
> psql testdb
testdb=# SELECT * from history ;
tid | bid | aid | delta | mtime | filler
-----+-----+-----+-------+-------+--------
(0 rows)

testdb=# SELECT * from accounts ;
ERROR: xlog flush request 0/1200F88 is not satisfied flushed
only to 0/1000060
CONTEXT: writing block 12 of relation 1663/16384/2619

* contnt of postmaster log file, pg811_1.log
---
LOG: database system was interrupted at 2005-12-14 03:10:51 JST
LOG: checkpoint record is at 0/33AC48
LOG: redo record is at 0/33AC48; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 567; next OID: 24576
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/33AC8C
LOG: could not open file "pg_xlog/000000010000000000000001" (log file 0, segment 1): no such file or direcotry
LOG: redo done at 0/FFFF5C
LOG: database system is ready
LOG: transaction ID wrap limit is 1073742415, limited by database "testdb"
ERROR: xlog flush request 0/1200F88 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 12 of relation 1663/16384/2619
ERROR: xlog flush request 0/11FDE40 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 4 of relation 1663/16384/2619
ERROR: xlog flush request 0/1205938 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 1 of relation 1663/16384/2696
ERROR: xlog flush request 0/11F0DB4 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 9 of relation 1663/16384/2619
ERROR: xlog flush request 0/11F0290 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 1 of relation 1663/16384/2619
ERROR: xlog flush request 0/1206B80 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 2 of relation 1663/16384/2696
ERROR: xlog flush request 0/1200F88 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 12 of relation 1663/16384/2619
WARNING: could not write block 12 of 1663/16384/2619
DETAIL: Multiple failures --- write error may be permanent.
ERROR: xlog flush request 0/11FDE40 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 4 of relation 1663/16384/2619
WARNING: could not write block 4 of 1663/16384/2619
DETAIL: Multiple failures --- write error may be permanent.
ERROR: xlog flush request 0/12D69A0 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 0 of relation 1663/16384/16391
ERROR: xlog flush request 0/134F6D4 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 1 of relation 1663/16384/16391
ERROR: xlog flush request 0/139AF10 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 2 of relation 1663/16384/16391
ERROR: xlog flush request 0/13CC814 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 3 of relation 1663/16384/16391
LOG: received smart shutdown request
LOG: shutting down
PANIC: xlog flush request 0/1205938 is not satisfied --- flushed only to 0/1000060
CONTEXT: writing block 1 of relation 1663/16384/2696
LOG: background writer process (PID 18969) was terminated by signal 6
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing
LOG: database system shutdown was interrupted at 2005-12-14 03:14:20 JST
LOG: checkpoint record is at 0/100001C
LOG: redo record is at 0/100001C; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 587; next OID: 24576
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery in progress
LOG: record with zero length at 0/1000060
LOG: redo is not required
LOG: database system is ready
LOG: transaction ID wrap limit is 1073742415, limited by database "testdb"
LOG: shutting down
LOG: database system is shut down
----

---
Okada Satoshi

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Kings-Lynne 2005-12-19 05:20:40 Re: [pgadmin-hackers] Client-side password encryption
Previous Message Alvaro Herrera 2005-12-19 04:29:36 Re: Does VACUUM reorder tables on clustered indices