Re: [BUG] Panic due to incorrect missingContrecPtr after promotion

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Date: 2022-08-05 21:28:16
Message-ID: 1577E3F5-E9A0-4E08-9B8D-8486572F354C@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Would you mind trying the second attached to abtain detailed log on
> your testing environment? With the patch, the modified TAP test yields
> the log lines like below.

I applied the logging patch to 13.7 ( attached is the backport ) and repro'd the
Issue.

I stripped out the relevant parts of the file. Let me know if this is
helpful.

postgresql.log.2022-08-05-17:2022-08-05 17:18:51 UTC::@:[359]:LOG: ### [S] @0/48A3400: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
postgresql.log.2022-08-05-17:2022-08-05 17:18:51 UTC::@:[359]:LOG: ### [F] @0/10000000: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
postgresql.log.2022-08-05-17:2022-08-05 17:22:21 UTC::@:[359]:LOG: ### [S] @0/10000060: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:38:00 UTC::@:[359]:LOG: ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:38:05 UTC::@:[359]:LOG: ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:38:06 UTC::@:[359]:LOG: ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:38:11 UTC::@:[359]:LOG: ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG: ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG: ### [S] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:50:13 UTC::@:[359]:LOG: ### [S] @6/B8000198: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG: ### [A] @6/F3FFFF20: abort=(6/F3FFFF20)0/0, miss=(6/F4000000)0/0, SbyMode=0, SbyModeReq=1
postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG: ### [S] @6/F4000030: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-20:2022-08-05 20:38:06 UTC::@:[359]:LOG: ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-20:2022-08-05 20:38:11 UTC::@:[359]:LOG: ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-20:2022-08-05 20:38:13 UTC::@:[359]:LOG: ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-20:2022-08-05 20:38:18 UTC::@:[359]:LOG: ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-20:2022-08-05 20:38:21 UTC::@:[359]:LOG: ### [F] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
postgresql.log.2022-08-05-20:2022-08-05 20:38:21 UTC::@:[359]:LOG: ### [S] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=0, SbyModeReq=1
....
.......
.........

2022-08-05 20:38:21 UTC::@:[359]:LOG: received promote request
2022-08-05 20:38:21 UTC::@:[359]:LOG: redo done at E/17A78270
2022-08-05 20:38:21 UTC::@:[359]:LOG: last completed transaction was at log time 2022-08-05 20:38:06.398382+00
recovering 000000020000000E00000005
000000020000000E00000005 archive /rdsdbdata/log/restore/pg-wal-archive.901.* is not yet downloaded, exiting restore script for now
2022-08-05 20:38:21 UTC::@:[359]:LOG: ### [S] @E/17A78298: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=0, SbyModeReq=1
recovering 00000003.history
2022-08-05 20:38:21 UTC::@:[359]:LOG: selected new timeline ID: 3
2022-08-05 20:38:21 UTC::@:[359]:LOG: archive recovery complete
recovering 00000002.history
2022-08-05 20:38:22 UTC::@:[373]:LOG: request to flush past end of generated WAL; request D/C7A425C0, currpos 6/F4000088
2022-08-05 20:38:22 UTC::@:[373]:CONTEXT: writing block 0 of relation base/14301/26678
2022-08-05 20:38:22 UTC::@:[373]:ERROR: xlog flush request D/C7A425C0 is not satisfied --- flushed only to 6/F4000088
2022-08-05 20:38:22 UTC::@:[373]:CONTEXT: writing block 0 of relation base/14301/26678
2022-08-05 20:38:22 UTC::@:[357]:LOG: database system is ready to accept connections
2022-08-05 20:38:23 UTC::@:[373]:LOG: checkpoint starting: immediate force wait wal time
2022-08-05 20:38:23 UTC::@:[373]:LOG: request to flush past end of generated WAL; request D/F5873CF0, currpos 6/F8000060
2022-08-05 20:38:23 UTC::@:[373]:PANIC: xlog flush request D/F5873CF0 is not satisfied --- flushed only to 6/F8000060
2022-08-05 20:38:23 UTC::@:[357]:LOG: checkpointer process (PID 373) was terminated by signal 6: Aborted
2022-08-05 20:38:23 UTC::@:[357]:LOG: terminating any other active server processes
/etc/rds/dbbin/pgscripts/rds_wal_archive: line 19: 14546 Quit (core dumped) usleep 50000
2022-08-05 20:38:23 UTC::@:[375]:FATAL: archive command failed with exit code 131
2022-08-05 20:38:23 UTC::@:[375]:DETAIL: The failed archive command was: /etc/rds/dbbin/pgscripts/rds_wal_archive pg_wal/00000003000000060000003D

Thanks,

Sami Imseih
Amazon Web Services

Attachment Content-Type Size
abortcont_additional_log_137.diff application/octet-stream 1.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-08-05 21:40:44 Draft back-branch release notes are up
Previous Message Andres Freund 2022-08-05 21:08:23 Re: Cleaning up historical portability baggage