pgstat_assert_is_up() can fail in walsender

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: pgstat_assert_is_up() can fail in walsender
Date: 2021-10-19 13:14:59
Message-ID: CA+HiwqEpGF=ROEvVOqvvDF=w9iaMBx0g5zBBhP62ZFE7vW6O8w@mail.gmail.com

Hi,

I can (almost) consistently reproduce $subject by executing the
attached shell script, which I was using while constructing a test
case for another thread.

The backtrace on the assert failure is this:

(gdb) bt
#0 0x00007fce0b018387 in raise () from /lib64/libc.so.6
#1 0x00007fce0b019a78 in abort () from /lib64/libc.so.6
#2 0x0000000000b0bdfc in ExceptionalCondition (conditionName=0xcc0828
"pgstat_is_initialized && !pgstat_is_shutdown",
errorType=0xcc01b0 "FailedAssertion", fileName=0xcbfe12
"pgstat.c", lineNumber=4852) at assert.c:69
#3 0x00000000008ac51e in pgstat_assert_is_up () at pgstat.c:4852
#4 0x00000000008a9623 in pgstat_send (msg=0x7ffd16db3240, len=144) at
pgstat.c:3075
#5 0x00000000008a7cbf in pgstat_report_replslot_drop (
slotname=0x7fce02dc6720 "pg_16399_sync_16389_7020757232905881693")
at pgstat.c:1869
#6 0x00000000008fb06b in ReplicationSlotDropPtr (slot=0x7fce02dc6708)
at slot.c:696
#7 0x00000000008fadbc in ReplicationSlotDropAcquired () at slot.c:585
#8 0x00000000008faa8a in ReplicationSlotRelease () at slot.c:482
#9 0x00000000009697c2 in ProcKill (code=1, arg=0) at proc.c:852
#10 0x0000000000940878 in shmem_exit (code=1) at ipc.c:272
#11 0x00000000009406a5 in proc_exit_prepare (code=1) at ipc.c:194
#12 0x00000000009405fc in proc_exit (code=1) at ipc.c:107
#13 0x0000000000b0c796 in errfinish (filename=0xce8525 "postgres.c",
lineno=3193,
funcname=0xcea370 <__func__.24551> "ProcessInterrupts") at elog.c:666
#14 0x0000000000976ce4 in ProcessInterrupts () at postgres.c:3191
#15 0x0000000000908023 in WalSndWaitForWal (loc=16785408) at walsender.c:1406
#16 0x0000000000906f58 in logical_read_xlog_page (state=0x191d9e0,
targetPagePtr=16777216, reqLen=8192,
targetRecPtr=22502048, cur_page=0x1929150 "") at walsender.c:821
#17 0x000000000059f450 in ReadPageInternal (state=0x191d9e0,
pageptr=22495232, reqLen=6840) at xlogreader.c:649
#18 0x000000000059ec2e in XLogReadRecord (state=0x191d9e0,
errormsg=0x7ffd16db3e68) at xlogreader.c:337
#19 0x00000000008d48fe in DecodingContextFindStartpoint
(ctx=0x191d620) at logical.c:606
#20 0x000000000090769a in CreateReplicationSlot (cmd=0x191c000) at
walsender.c:1038
#21 0x000000000090851f in exec_replication_command (
cmd_string=0x185f470 "CREATE_REPLICATION_SLOT
\"pg_16399_sync_16389_7020757232905881693\" LOGICAL pgoutput (SNAPSHOT
'use')") at walsender.c:1636
#22 0x00000000009783d1 in PostgresMain (dbname=0x18896e8 "postgres",
username=0x18896c8 "amit") at postgres.c:4493
#23 0x00000000008b3e1d in BackendRun (port=0x1880fb0) at postmaster.c:4560
#24 0x00000000008b37bd in BackendStartup (port=0x1880fb0) at postmaster.c:4288
#25 0x00000000008afd03 in ServerLoop () at postmaster.c:1801
#26 0x00000000008af5da in PostmasterMain (argc=5, argv=0x1859ba0) at
postmaster.c:1473
#27 0x00000000007b074b in main (argc=5, argv=0x1859ba0) at main.c:198

cc'ing Andres and Horiguchi-san because pgstat_assert_is_up() was added in
the recent commit ee3f8d3d3ae, though I'm not sure that commit is at
fault. I wonder whether the adjacent commit fb2c5028e635 may be
responsible instead.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
publish_via_root_problem.sh text/x-sh 1.3 KB
