gcc 4.6 and hot standby

From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: gcc 4.6 and hot standby
Date: 2011-06-08 18:12:48
Message-ID: BANLkTik+zLdJAbT9iL1DS6zFxg2abofbgw@mail.gmail.com
Lists: pgsql-hackers

So I've been delaying moving some production boxes over to 9.0.4 from
9.0.2 because hot standby fails with:
(this is on the "hot standby" machine that connects to the master)

2011-06-08 11:40:48 MDT [6072]: [2-1] user= LOG: entering standby mode
2011-06-08 11:40:48 MDT [6072]: [3-1] user= DEBUG: checkpoint record
is at 86/E5D725F0
2011-06-08 11:40:48 MDT [6072]: [4-1] user= DEBUG: redo record is at
86/E39E8248; shutdown FALSE
2011-06-08 11:40:48 MDT [6072]: [5-1] user= DEBUG: next transaction
ID: 0/35456371; next OID: 34090526
2011-06-08 11:40:48 MDT [6072]: [6-1] user= DEBUG: next MultiXactId:
523; next MultiXactOffset: 1046
2011-06-08 11:40:48 MDT [6072]: [7-1] user= DEBUG: oldest unfrozen
transaction ID: 654, in database 1
2011-06-08 11:40:48 MDT [6072]: [8-1] user= DEBUG: transaction ID
wrap limit is 2147484301, limited by database with OID 1
2011-06-08 11:40:48 MDT [6072]: [9-1] user= DEBUG: initializing for hot standby
2011-06-08 11:40:48 MDT [6072]: [10-1] user= LOG: redo starts at 86/E39E8248
2011-06-08 11:40:48 MDT [6072]: [11-1] user= LOG: invalid record
length at 86/E39F2010
2011-06-08 11:40:48 MDT [6074]: [1-1] user= LOG: streaming
replication successfully connected to primary
2011-06-08 11:40:49 MDT [6072]: [12-1] user= LOG: invalid record
length at 86/E3A16010
2011-06-08 11:40:49 MDT [6074]: [2-1] user= FATAL: terminating
walreceiver process due to administrator command
2011-06-08 11:40:49 MDT [6072]: [13-1] user= LOG: invalid record
length at 86/E3A3C010
2011-06-08 11:40:53 MDT [6072]: [14-1] user= LOG: invalid record
length at 86/E3A54010
2011-06-08 11:40:53 MDT [6075]: [1-1] user= FATAL: terminating
walreceiver process due to administrator command
2011-06-08 11:40:53 MDT [6072]: [15-1] user= LOG: invalid record
length at 86/E3A74010
2011-06-08 11:40:58 MDT [6076]: [1-1] user= LOG: streaming
replication successfully connected to primary
2011-06-08 11:40:59 MDT [6072]: [16-1] user= LOG: invalid record
length at 86/E3AC6010
2011-06-08 11:40:59 MDT [6076]: [2-1] user= FATAL: terminating
walreceiver process due to administrator command
2011-06-08 11:40:59 MDT [6072]: [17-1] user= LOG: invalid record
length at 86/E3ACC010
2011-06-08 11:41:03 MDT [6072]: [18-1] user= LOG: invalid record
length at 86/E3B32010
2011-06-08 11:41:03 MDT [6078]: [1-1] user= FATAL: terminating
walreceiver process due to administrator command
[ repeats... ]

Originally I thought there might be some corner case bug in 9.0.3 or
9.0.4. However, recompiling 9.0.2 with gcc 4.6 hit the same problem,
while 9.0.4 compiled with gcc 4.5 seemed to work great. I then tried
various optimization levels on 4.6:
-O0: works
-O1: works
-O2: fails
-Os: works

I suppose the next step is to narrow it down to a specific flag -O2
uses... But I thought I would post here first-- maybe someone else has
hit this? Or maybe someone has a bright idea on how to narrow this
down?
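
One possible way to get the list of candidate flags, as a rough sketch
only: gcc can print which optimizer flags a given level enables via
"gcc -Q --help=optimizers -O<n>", so the two outputs can be compared.
The helper names below (parse_optimizers, enabled_only_in, candidates)
are made up for illustration, and the parsing assumes the usual
"  -fsomething    [enabled]" line layout. Note that not every
optimization is controlled by an individual flag, so -O1 plus all the
listed flags is not guaranteed to behave exactly like -O2.

```python
import re
import subprocess


def parse_optimizers(text):
    """Map each -f flag to 'enabled' or 'disabled' from
    `gcc -Q --help=optimizers` output. Lines that do not carry an
    [enabled]/[disabled] marker are ignored."""
    flags = {}
    for line in text.splitlines():
        m = re.match(r"\s+(-f[\w-]+)\s+\[(enabled|disabled)\]", line)
        if m:
            flags[m.group(1)] = m.group(2)
    return flags


def enabled_only_in(hi, lo):
    """Flags enabled in `hi` that are not enabled in `lo`, sorted."""
    return sorted(f for f, state in hi.items()
                  if state == "enabled" and lo.get(f) != "enabled")


def candidates(gcc="gcc", working="-O1", failing="-O2"):
    """Ask the local gcc which flags the failing level adds on top of
    the working one. Requires gcc to be on PATH."""
    def query(level):
        result = subprocess.run([gcc, "-Q", "--help=optimizers", level],
                                capture_output=True, text=True, check=True)
        return parse_optimizers(result.stdout)
    return enabled_only_in(query(failing), query(working))
```

Calling candidates() should return the flags that -O2 turns on beyond
-O1. Each one could then be tried by rebuilding at -O1 with that single
flag added, or at -O2 with its -fno- counterpart, and re-running the
standby until the "invalid record length" messages appear or go away.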

# linux 2.6.39.1 x86_64 AMD opteron box
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /build/src/gcc-4.6-20110603/configure --prefix=/usr
--libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++
--enable-shared --enable-threads=posix --with-system-zlib
--enable-__cxa_atexit --disable-libunwind-exceptions
--enable-clocale=gnu --enable-gnu-unique-object
--enable-linker-build-id --with-ppl --enable-cloog-backend=isl
--enable-lto --enable-gold --enable-ld=default --enable-plugin
--with-plugin-ld=ld.gold --disable-multilib --disable-libstdcxx-pch
--enable-checking=release
Thread model: posix
gcc version 4.6.0 20110603 (prerelease) (GCC)
