Resetting undo/redo logs

From: Edmon Begoli <ebegoli(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Resetting undo/redo logs
Date: 2012-06-21 14:12:55
Message-ID: CAGj+Ysdv0sun-jWyJEEKQ5x1xUPJN0Bu9M=eK7YQsdpAEEZ66Q@mail.gmail.com
Lists: pgsql-hackers

I have an issue on Greenplum, which is an MPP database derived from
PostgreSQL 8.2, and the failure I am seeing comes entirely from the
core PostgreSQL code.

One of the Greenplum segments went down and cannot recover. Startup
fails with:
"PANIC XX000 invalid redo/undo record in shutdown checkpoint
(xlog.c:6576)"
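
To show what I think is being rejected, here is a small sketch that
compares the three locations reported in the startup log at the end of
this message. My assumption, which I have not confirmed against the
Greenplum source, is that a shutdown checkpoint is refused when its
redo or undo pointer lies before the checkpoint record itself. The
function names are mine.

```python
# Sketch only: mirrors what I believe the startup check does; not verified
# against the Greenplum source. Function names are hypothetical.


def parse_location(text):
    """Split 'xlogid/xrecoff' (hex) into a comparable pair of integers."""
    xlogid, xrecoff = text.split("/")
    return int(xlogid, 16), int(xrecoff, 16)


def shutdown_checkpoint_looks_consistent(checkpoint, redo, undo):
    """True if neither redo nor undo points before the checkpoint record."""
    ckpt = parse_location(checkpoint)
    return parse_location(redo) >= ckpt and parse_location(undo) >= ckpt


# Values copied from the startup log in this message.
print(shutdown_checkpoint_looks_consistent(
    "14/48000020", "14/48000020", "14/42AC2118"))
```

With these values the undo pointer (14/42AC2118) is behind the
checkpoint record (14/48000020), so the sketch prints False.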

I am posting this question here because most casual users of
Postgres/Greenplum tell me that the database is hosed, but I think
that with pg_resetxlog
(http://www.postgresql.org/docs/8.2/static/app-pgresetxlog.html) and
some data loss I could at least "hack" the database into coming back up.

What I am asking for is help calculating the reset values: where to
find the most recent valid ones and how, *specifically*, to calculate
the values to reset to.
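
To make the arithmetic concrete, here is a rough sketch of how I
understand a WAL location such as 14/48000070 maps to a segment file
name and to a candidate -l value (timelineid,fileid,seg on the 8.2
pg_resetxlog page). It assumes the default 16 MB segment size and
timeline 1, neither of which I have confirmed for this Greenplum
build, and the function names are my own.

```python
# Sketch only: assumes 16 MB WAL segments and timeline 1 (both unverified
# for this Greenplum build). Function names are hypothetical.
XLOG_SEG_SIZE = 16 * 1024 * 1024
SEGS_PER_LOG_FILE = 0xFFFFFFFF // XLOG_SEG_SIZE  # 255 in stock 8.2


def parse_location(text):
    """Split 'xlogid/xrecoff' (hex) into two integers."""
    xlogid, xrecoff = text.split("/")
    return int(xlogid, 16), int(xrecoff, 16)


def segment_file_name(location, timeline=1):
    """Name of the WAL segment file that contains the given location."""
    xlogid, xrecoff = parse_location(location)
    return "%08X%08X%08X" % (timeline, xlogid, xrecoff // XLOG_SEG_SIZE)


def next_segment_triple(location, timeline=1):
    """timelineid,fileid,seg for the segment after the given location."""
    xlogid, xrecoff = parse_location(location)
    seg = xrecoff // XLOG_SEG_SIZE + 1
    if seg >= SEGS_PER_LOG_FILE:
        # Roll over to the first segment of the next log file.
        xlogid, seg = xlogid + 1, 0
    return "0x%X,0x%X,0x%X" % (timeline, xlogid, seg)


print(segment_file_name("14/48000070"))
print(next_segment_triple("14/48000070"))
```

For 14/48000070 this gives 000000010000001400000048 and 0x1,0x14,0x49.
The documentation says the -l value should be larger than any segment
file name present in pg_xlog, so this would only be a starting point
if nothing newer exists there.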

Please advise,
Edmon

2012-06-12 13:16:18.614912
EDT p14611 th802662304 0 seg-1 LOG 0 mirror transition,
primary address(port) 'boxgp10a(41001)' mirror address(port)
'boxgp02a(51001)' mirroring role 'primary role' mirroring state
'change tracking' segment state 'not initialized' process name(pid)
'filerep main process(14611)' filerep state 'not initialized'
0 cdbfilerep.c 3371
2012-06-12 13:16:18.617047
EDT p14612 th802662304 0 seg-1 LOG 0 CHANGETRACKING:
ChangeTracking_RetrieveIsTransitionToInsync() found
insync_transition_completed:'false' full
resync:'false' 0 cdbresynchronizechangetracking.c 2522
2012-06-12 13:16:18.617113
EDT p14612 th802662304 0 seg-1 LOG 0 CHANGETRACKING:
ChangeTracking_RetrieveIsTransitionToResync() found
resync_transition_completed:'false' full
resync:'false' 0 cdbresynchronizechangetracking.c 2559
2012-06-12 13:16:18.746870
EDT p14612 th802662304 0 seg-1 LOG 0 searching for last
checkpoint location for creating the initial resynchronize
changetracking 0 xlog.c 10836
2012-06-12 13:16:18.747318
EDT p14612 th802662304 0 seg-1 LOG 0 record with zero
length at 14/48000070 0 xlog.c 4182
2012-06-12 13:16:18.747491
EDT p14612 th802662304 0 seg-1 LOG 0 scanned through 1
initial xlog records since last checkpoint for writing into the
resynchronize change log 0 cdbresynchronizechangetracking.c 206
2012-06-12 13:16:18.750830
EDT p14624 th802662304 0 seg-1 LOG 0 database system was
shut down at 2012-06-12 11:00:13 EDT 0 xlog.c 6326
2012-06-12 13:16:18.750987
EDT p14624 th802662304 0 seg-1 LOG 0 checkpoint record is
at 14/48000020 0 xlog.c 6425
2012-06-12 13:16:18.751016
EDT p14624 th802662304 0 seg-1 LOG 0 redo record is at
14/48000020; undo record is at 14/42AC2118; shutdown
TRUE 0 xlog.c 6534
2012-06-12 13:16:18.751041
EDT p14624 th802662304 0 seg-1 LOG 0 next transaction ID:
0/4553423; next OID: 241771 0 xlog.c 6538
2012-06-12 13:16:18.751065
EDT p14624 th802662304 0 seg-1 LOG 0 next MultiXactId: 271;
next MultiXactOffset: 549 0 xlog.c 6541
2012-06-12 13:16:18.796637
EDT p14624 th802662304 0 seg-1 PANIC XX000 invalid
redo/undo record in shutdown checkpoint
(xlog.c:6576) 0 xlog.c 6576 "Stack trace:
1 0xa59f75 postgres errstart + 0x595
2 0x50f7ac postgres StartupXLOG + 0x1b8c
3 0x51778d postgres StartupProcessMain + 0x2fd
4 0x590746 postgres AuxiliaryProcessMain + 0x796
5 0x85fe54 postgres <symbol not found> + 0x85fe54
6 0x86003a postgres StartMasterOrPrimaryPostmasterProcesses + 0x3a
7 0x86ffaf postgres doRequestedPrimaryMirrorModeTransitions + 0xd9f
8 0x86bc4a postgres PostmasterMain + 0x1f8a
9 0x772bda postgres main + 0x4da
10 0x2af72ebc7994 libc.so.6 __libc_start_main + 0xf4
11 0x47bf49 postgres <symbol not found> + 0x47bf49
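
For reference, this is a sketch of how I would pull the counters out of
the checkpoint lines above and turn them into a pg_resetxlog argument
list. The option letters are the ones listed on the 8.2 pg_resetxlog
page; I have not checked that the Greenplum build accepts the same
ones. The data directory path and the function names are placeholders
of mine, and the script only builds the argument list without running
anything.

```python
import re

# Sketch only. The excerpt is copied from the log in this message; the
# data directory below is a hypothetical placeholder.
LOG_EXCERPT = (
    "next transaction ID: 0/4553423; next OID: 241771 "
    "next MultiXactId: 271; next MultiXactOffset: 549"
)


def checkpoint_counters(log_text):
    """Extract the counters reported for the last checkpoint."""
    epoch, xid = re.search(
        r"next transaction ID: (\d+)/(\d+)", log_text).groups()
    return {
        "epoch": int(epoch),
        "xid": int(xid),
        "oid": int(re.search(r"next OID: (\d+)", log_text).group(1)),
        "mxid": int(re.search(r"next MultiXactId: (\d+)", log_text).group(1)),
        "mxoff": int(
            re.search(r"next MultiXactOffset: (\d+)", log_text).group(1)),
    }


def reset_args(counters, datadir):
    """Argument list using -n (no operation), so nothing is modified."""
    return [
        "pg_resetxlog", "-n",
        "-x", str(counters["xid"]),
        "-e", str(counters["epoch"]),
        "-o", str(counters["oid"]),
        "-m", str(counters["mxid"]),
        "-O", str(counters["mxoff"]),
        datadir,
    ]


counters = checkpoint_counters(LOG_EXCERPT)
print(" ".join(reset_args(counters, "/path/to/segment/datadir")))
```

These numbers describe the state at the last checkpoint, so I assume
they are at best lower bounds. The documentation instead derives a safe
-x from the largest file name in pg_clog, which is part of what I am
unsure how to do here.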
