Walsender may fail to send wal to the end.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Walsender may fail to send wal to the end.
Date: 2021-03-26 09:20:14
Message-ID: 20210326.182014.298226099985413968.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Hello. I happened to notice some dubious behavior in walsender.

On a replication setup with wal_keep_size (wal_keep_segments on older
versions) set to 0, running the following command on the primary causes
walsender to fail to send WAL up to the final shutdown checkpoint
record to the standby.

(create table t in advance)

psql -c 'insert into t values(0); select pg_switch_wal();'; pg_ctl stop

The primary complains like this:

2021-03-26 17:59:29.324 JST [checkpointer][140697] LOG: shutting down
2021-03-26 17:59:29.387 JST [walsender][140816] ERROR: requested WAL segment 000000010000000000000032 has already been removed
2021-03-26 17:59:29.387 JST [walsender][140816] STATEMENT: START_REPLICATION 0/32000000 TIMELINE 1
2021-03-26 17:59:29.394 JST [postmaster][140695] LOG: database system is shut down

This is because XLogSendPhysical detects that the WAL segment it is
currently reading has been removed by the shutdown checkpoint. However,
there's no fear of WAL segments being overwritten at that point.

So I think we can omit the call to CheckXLogRemoved() while
MyWalSnd->state is WALSNDSTATE_STOPPING, because that state comes after
the shutdown checkpoint completes.

Of course that doesn't help if walsender was running two segments
behind. There could still be a small window for the failure. But it
would be a great help for the case of being just one segment behind.

Is it worth fixing?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment: phys_walsnd_reads_removed_segs_while_shutdown.patch (text/x-patch, 1.0 KB)
