Re: Completely broken replica after PANIC: WAL contains references to invalid pages

From: "anarazel(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>,pgsql-bugs(at)postgresql(dot)org
Cc: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, Panchenko(at)gw(dot)tander(dot)ru, tolstenko_iv(at)gw(dot)tander(dot)ru
Subject: Re: Completely broken replica after PANIC: WAL contains references to invalid pages
Date: 2013-03-29 20:52:47
Message-ID: 675b7cee-b7f0-4e32-8e34-1efaf3ca5fe9@email.android.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi

Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> schrieb:

>Hi all,
>
>A couple of days ago I found the replica stopped after the PANIC
>message:
>
>PANIC: WAL contains references to invalid pages
>
>When I tried to restart it I got this FATAL:
>
>FATAL: could not access status of transaction 280557568
>
>Below is the description of the server and information from PostgreSQL
>and system logs. After googling the problem I have found nothing like
>this.
>
>Any thoughts of what it could be and how to prevent it in the future?

I think I See whats going on. Do you still have the datadir available? If so, could you send the pg_controldata output?

Andres

>Hardware:
>
>IBM System x3650 M4, 148GB RAM, NAS
>
>Software:
>
>PostgreSQL 9.2.3, yum.postgresql.org
>CentOS 6.3, kernel 2.6.32-279.22.1.el6.x86_64
>
>Configuration:
>
>listen_addresses = '*'
>max_connections = 550
>shared_buffers = 35GB
>work_mem = 256MB
>maintenance_work_mem = 1GB
>bgwriter_delay = 10ms
>bgwriter_lru_multiplier = 10.0
>effective_io_concurrency = 32
>wal_level = hot_standby
>synchronous_commit = off
>checkpoint_segments = 1024
>checkpoint_timeout = 1h
>checkpoint_completion_target = 0.9
>checkpoint_warning = 5min
>max_wal_senders = 3
>wal_keep_segments = 2048
>hot_standby = on
>max_standby_streaming_delay = 5min
>hot_standby_feedback = on
>effective_cache_size = 133GB
>log_directory = '/var/log/pgsql'
>log_filename = 'postgresql-%Y-%m-%d.log'
>log_checkpoints = on
>log_line_prefix = '%t %p %u(at)%d from %h [vxid:%v txid:%x] [%i] '
>log_lock_waits = on
>log_statement = 'ddl'
>log_timezone = 'W-SU'
>track_activity_query_size = 4096
>autovacuum_max_workers = 5
>autovacuum_naptime = 5s
>autovacuum_vacuum_scale_factor = 0.05
>autovacuum_analyze_scale_factor = 0.05
>autovacuum_vacuum_cost_delay = 5ms
>datestyle = 'iso, dmy'
>timezone = 'W-SU'
>lc_messages = 'en_US.UTF-8'
>lc_monetary = 'ru_RU.UTF-8'
>lc_numeric = 'ru_RU.UTF-8'
>lc_time = 'ru_RU.UTF-8'
>default_text_search_config = 'pg_catalog.russian'
>
>System:
>
># Controls the maximum shared segment size, in bytes
>kernel.shmmax = 53287555072
>
># Controls the maximum number of shared memory segments, in pages
>kernel.shmall = 13009657
>
># Maximum number of file-handles
>fs.file-max = 65535
>
># pdflush tuning to prevent lag spikes
>vm.dirty_ratio = 10
>vm.dirty_background_ratio = 1
>vm.dirty_expire_centisecs = 499
>
># Prevent the scheduler breakdown
>kernel.sched_migration_cost = 5000000
>
># Turned off to provide more CPU to PostgreSQL
>kernel.sched_autogroup_enabled = 0
>
># Setup hugepages
>vm.hugetlb_shm_group = 26
>vm.hugepages_treat_as_movable = 0
>vm.nr_overcommit_hugepages = 512
>
># The Huge Page Size is 2048kB, so for 35GB shared buffers the number
>is 17920
>vm.nr_hugepages = 17920
>
># Turn off the NUMA local pages reclaim as it leads to wrong caching
>strategy for databases
>vm.zone_reclaim_mode = 0
>
>Environment:
>
>HUGETLB_SHM=yes
>LD_PRELOAD='/usr/lib64/libhugetlbfs.so'
>export HUGETLB_SHM LD_PRELOAD
>
>When it is stopped:
>
>2013-03-26 11:50:32 MSK 3775 @ from [vxid: txid:0] [] LOG:
>restartpoint complete: wrote 1685004 buffers (36.7%); 0 transaction
>log file(s) added, 0 removed, 555 recycled; write=3237.402 s,
>sync=0.071 s, total=3237.507 s; sync files=2673, longest=0.008 s,
>average=0.000 s
>2013-03-26 11:50:32 MSK 3775 @ from [vxid: txid:0] [] LOG: recovery
>restart point at 2538/6E154AC0
>2013-03-26 11:50:32 MSK 3775 @ from [vxid: txid:0] [] DETAIL: last
>completed transaction was at log time 2013-03-26 11:50:31.613948+04
>2013-03-26 11:50:32 MSK 3775 @ from [vxid: txid:0] [] LOG:
>restartpoint starting: xlog
>2013-03-26 11:51:16 MSK 3773 @ from [vxid:1/0 txid:0] [] WARNING:
>page 451 of relation base/16436/2686702648 is uninitialized
>2013-03-26 11:51:16 MSK 3773 @ from [vxid:1/0 txid:0] [] CONTEXT:
>xlog redo vacuum: rel 1663/16436/2686702648; blk 2485,
>lastBlockVacuumed 0
>2013-03-26 11:51:16 MSK 3773 @ from [vxid:1/0 txid:0] [] PANIC: WAL
>contains references to invalid pages
>2013-03-26 11:51:16 MSK 3773 @ from [vxid:1/0 txid:0] [] CONTEXT:
>xlog redo vacuum: rel 1663/16436/2686702648; blk 2485,
>lastBlockVacuumed 0
>2013-03-26 11:51:16 MSK 3770 @ from [vxid: txid:0] [] LOG: startup
>process (PID 3773) was terminated by signal 6: Aborted
>2013-03-26 11:51:16 MSK 3770 @ from [vxid: txid:0] [] LOG:
>terminating any other active server processes
>
>From /var/log/messages:
>
>Mar 26 10:50:52 tms2 kernel: : postmaster: page allocation failure.
>order:8, mode:0xd0
>Mar 26 10:50:52 tms2 kernel: : Pid: 3774, comm: postmaster Not tainted
>2.6.32-279.22.1.el6.x86_64 #1
>Mar 26 10:50:52 tms2 kernel: : Call Trace:
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8112343f>] ?
>__alloc_pages_nodemask+0x77f/0x940
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8115d3e2>] ?
>kmem_getpages+0x62/0x170
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8115dffa>] ?
>fallback_alloc+0x1ba/0x270
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8115da4f>] ?
>cache_grow+0x2cf/0x320
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8115dd79>] ?
>____cache_alloc_node+0x99/0x160
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff813fd455>] ?
>dma_pin_iovec_pages+0xb5/0x230
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8115eb49>] ?
>__kmalloc+0x189/0x220
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff813fd455>] ?
>dma_pin_iovec_pages+0xb5/0x230
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8141a47c>] ?
>lock_sock_nested+0xac/0xc0
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8146edaa>] ?
>tcp_recvmsg+0x4ca/0xe80
>Mar 26 10:50:52 tms2 kernel: : [<ffffffffa029a67e>] ?
>xfs_vm_write_end+0x2e/0x60 [xfs]
>Mar 26 10:50:52 tms2 kernel: : [<ffffffffa0292f39>] ?
>xfs_trans_unlocked_item+0x39/0x60 [xfs]
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8148f90a>] ?
>inet_recvmsg+0x5a/0x90
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff81418b93>] ?
>sock_recvmsg+0x133/0x160
>Mar 26 10:50:52 tms2 kernel: : [<ffffffffa029f542>] ?
>xfs_rw_iunlock+0x32/0x40 [xfs]
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff81090be0>] ?
>autoremove_wake_function+0x0/0x40
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff81173fff>] ?
>__dentry_open+0x23f/0x360
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff811764ca>] ?
>do_sync_write+0xfa/0x140
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff811875d0>] ?
>do_filp_open+0x780/0xd60
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff81418d0e>] ?
>sys_recvfrom+0xee/0x180
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff81183105>] ? putname+0x35/0x50
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff81176842>] ?
>vfs_write+0x132/0x1a0
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff810d3f47>] ?
>audit_syscall_entry+0x1d7/0x200
>Mar 26 10:50:52 tms2 kernel: : [<ffffffff8100b072>] ?
>system_call_fastpath+0x16/0x1b
>
>After restart:
>
>2013-03-27 10:43:01 MSK 1535 @ from [vxid: txid:0] [] LOG: database
>system was interrupted while in recovery at log time 2013-03-26
>10:08:37 MSK
>2013-03-27 10:43:01 MSK 1535 @ from [vxid: txid:0] [] HINT: If this
>has occurred more than once some data might be corrupted and you might
>need to choose an earlier recovery target.
>2013-03-27 10:43:01 MSK 1535 @ from [vxid: txid:0] [] LOG: entering
>standby mode
>2013-03-27 10:43:01 MSK 1535 @ from [vxid:1/0 txid:0] [] LOG: redo
>starts at 2538/6E154AC0
>2013-03-27 10:43:02 MSK 1535 @ from [vxid:1/0 txid:0] [] LOG: file
>"pg_subtrans/10B7" doesn't exist, reading as zeroes
>2013-03-27 10:43:02 MSK 1535 @ from [vxid:1/0 txid:0] [] CONTEXT:
>xlog redo xid assignment xtop 280482186: subxacts: 280483328 280483333
>280483343 280483354 280483356 280483368 280483377 280483382 280483392
>280483404 280483416 280483429 280483440 280483451 280483460 280483472
>280483487 280483500 280483516 280483530 280483541 280483555 280483565
>280483574 280483585 280483595 280483604 280483607 280483617 280483626
>280483636 280483646 280483656 280483665 280483677 280483688 280483699
>280483709 280483719 280483730 280483739 280483749 280483759 280483761
>280483771 280483782 280483799 280483800 280483811 280483821 280483824
>280483836 280483847 280483859 280483871 280483874 280483883 280483897
>280483906 280483915 280483925 280483937 280483948 280483958
>2013-03-27 10:43:02 MSK 1535 @ from [vxid:1/0 txid:0] [] LOG: file
>"pg_subtrans/10B7" doesn't exist, reading as zeroes
>2013-03-27 10:43:02 MSK 1535 @ from [vxid:1/0 txid:0] [] CONTEXT:
>xlog redo xid assignment xtop 280482270: subxacts: 280485056 280485070
>280485083 280485086 280485098 280485113 280485132 280485144 280485156
>280485167 280485178 280485188 280485201 280485217 280485234 280485249
>280485267 280485293 280485309 280485327 280485333 280485345 280485353
>280485373 280485388 280485405 280485420 280485434 280485457 280485476
>280485482 280485507 280485516 280485531 280485537 280485550 280485565
>280485568 280485585 280485587 280485601 280485613 280485634 280485639
>280485656 280485669 280485684 280485690 280485693 280485712 280485730
>280485754 280485757 280485779 280485801 280485808 280485811 280485830
>280485856 280485880 280485900 280485920 280485941 280485946
>
>[ skipped several more messages of this kind]
>
>2013-03-27 10:43:03 MSK 1535 @ from [vxid:1/0 txid:0] [] LOG: file
>"pg_subtrans/10B8" doesn't exist, reading as zeroes
>2013-03-27 10:43:03 MSK 1535 @ from [vxid:1/0 txid:0] [] CONTEXT:
>xlog redo xid assignment xtop 280549936: subxacts: 280555514 280555515
>280555516 280555517 280555518 280555519 280555520 280555521 280555522
>280555523 280555524 280555525 280555526 280555527 280555528 280555529
>280555530 280555531 280555532 280555533 280555534 280555535 280555536
>280555537 280555538 280555539 280555540 280555541 280555542 280555543
>280555544 280555545 280555546 280555547 280555548 280555549 280555550
>280555551 280555552 280555553 280555554 280555555 280555556 280555557
>280555558 280555559 280555560 280555561 280555562 280555563 280555564
>280555565 280555566 280555567 280555568 280555569 280555570 280555571
>280555572 280555573 280555574 280555575 280555576 280555577
>2013-03-27 10:43:03 MSK 1535 @ from [vxid:1/0 txid:0] [] FATAL:
>could not access status of transaction 280557568
>2013-03-27 10:43:03 MSK 1535 @ from [vxid:1/0 txid:0] [] DETAIL:
>Could not read from file "pg_subtrans/10B8" at offset 253952: Success.
>2013-03-27 10:43:03 MSK 1535 @ from [vxid:1/0 txid:0] [] CONTEXT:
>xlog redo xid assignment xtop 280555981: subxacts: 280557520 280557521
>280557522 280557523 280557524 280557525 280557526 280557527 280557528
>280557529 280557530 280557531 280557532 280557533 280557534 280557535
>280557536 280557537 280557538 280557539 280557540 280557541 280557542
>280557543 280557544 280557545 280557546 280557547 280557548 280557549
>280557550 280557551 280557552 280557553 280557554 280557555 280557556
>280557557 280557558 280557559 280557560 280557561 280557562 280557563
>280557564 280557565 280557566 280557567 280557568 280557569 280557570
>280557571 280557572 280557573 280557574 280557575 280557576 280557577
>280557578 280557579 280557580 280557581 280557582 280557583
>2013-03-27 10:43:03 MSK 1532 @ from [vxid: txid:0] [] LOG: startup
>process (PID 1535) exited with exit code 1
>2013-03-27 10:43:03 MSK 1532 @ from [vxid: txid:0] [] LOG:
>terminating any other active server processes
>
>Thank you in advance.
>
>--
>Kind regards,
>Sergey Konoplev
>Database and Software Consultant
>
>Profile: http://www.linkedin.com/in/grayhemp
>Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988)
>888-1979
>Skype: gray-hemp
>Jabber: gray(dot)ru(at)gmail(dot)com

---
Please excuse brevity and formatting - I am writing this on my mobile phone.
---
Please excuse brevity and formatting - I am writing this on my mobile phone.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Sergey Konoplev 2013-03-29 21:19:59 Re: Completely broken replica after PANIC: WAL contains references to invalid pages
Previous Message Jeff Davis 2013-03-29 18:00:44 Re: Issue with range types and casts?