From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: 001_rep_changes.pl stalls
Date: 2020-04-06 06:36:49
Message-ID: 20200406063649.GA3738151@rfd.leadboat.com
Lists: pgsql-hackers

Executive summary: the "MyWalSnd->write < sentPtr" test in WalSndWaitForWal() is
important for promptly updating pg_stat_replication. When caught up, we
should impose that logic before every sleep. The one-line fix is to sleep in
WalSndLoop() only when pq_is_send_pending(), not when caught up.

On my regular development machine, src/test/subscription/t/001_rep_changes.pl
stalls for ~10s at this wait_for_catchup:

$node_publisher->safe_psql('postgres', "DELETE FROM tab_rep");

# Restart the publisher and check the state of the subscriber which
# should be in a streaming state after catching up.
$node_publisher->stop('fast');
$node_publisher->start;

$node_publisher->wait_for_catchup('tap_sub');

That snippet emits three notable physical WAL records. There's a
Transaction/COMMIT at the end of the DELETE, an XLOG/CHECKPOINT_SHUTDOWN, and
an XLOG/FPI_FOR_HINT.
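
Of those, only the Transaction/COMMIT produces anything the output plugin
sends to the subscriber; the other two are swallowed inside logical decoding.
Roughly, paraphrased from DecodeXLogOp() in
src/backend/replication/logical/decode.c (the exact case list is from memory
and may differ by branch):

    switch (info)
    {
        case XLOG_CHECKPOINT_SHUTDOWN:
            /* snapshot-builder bookkeeping only; nothing goes to the client */
            SnapBuildSerializationPoint(builder, buf->origptr);
            break;
        case XLOG_FPI:
        case XLOG_FPI_FOR_HINT:
            /* full-page images carry no logical changes */
            break;
        /* ... */
    }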

The buildfarm has stalled there too, though probably in less than half of
runs. Examples[1] showing the stall:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mandrill&dt=2020-03-20%2017%3A09%3A53&stg=subscription-check
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=thorntail&dt=2020-03-22%2019%3A51%3A38&stg=subscription-check
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hoverfly&dt=2020-03-19%2003%3A35%3A01&stg=subscription-check
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hoverfly&dt=2020-03-20%2015%3A15%3A01&stg=subscription-check

Here's the most-relevant walsender call tree:

WalSndLoop
  XLogSendLogical (caller invokes once per loop iteration, via send_data callback)
    XLogReadRecord (caller invokes once)
      ReadPageInternal (caller invokes twice in this test; more calls are possible)
        logical_read_xlog_page (caller skips when page is same as last call, else invokes 1-2 times via state->read_page() callback, registered in StartLogicalReplication)
          WalSndWaitForWal (caller invokes once; has fast path)
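
The "MyWalSnd->write < sentPtr" logic from the summary sits in
WalSndWaitForWal(), just before it sleeps. Roughly, paraphrased from
walsender.c (the guards other than the write test are from memory):

    /*
     * Before sleeping while caught up, ping an otherwise-idle receiver so
     * its reply refreshes MyWalSnd->write/flush and, with them,
     * pg_stat_replication.
     */
    if (MyWalSnd->flush < sentPtr &&
        MyWalSnd->write < sentPtr &&
        !waiting_for_ping_response)
        WalSndKeepalive(true);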

The cause is a race involving the flow of reply messages (send_feedback()
messages) from the logical apply worker to the walsender. Here are two
sequencing patterns; the more-indented parts are what differ. Stalling pattern:

sender reads Transaction/COMMIT and sends the changes
  receiver applies the changes
  receiver send_feedback() reports progress up to Transaction/COMMIT
  sender accepts the report
sender reads XLOG/CHECKPOINT_SHUTDOWN and/or XLOG/FPI_FOR_HINT, which are no-ops for logical rep
sender WalSndCaughtUp becomes true; sender sleeps in WalSndLoop()
  receiver wal_receiver_status_interval elapses; receiver reports progress up to Transaction/COMMIT
  sender wakes up, accepts the report
sender calls WalSndWaitForWal(), which sends a keepalive due to "MyWalSnd->write < sentPtr"
receiver gets keepalive, send_feedback() reports progress up to XLOG/FPI_FOR_HINT

Non-stalling pattern (more prevalent with lower machine performance):

sender reads Transaction/COMMIT and sends the changes
sender reads XLOG/CHECKPOINT_SHUTDOWN and/or XLOG/FPI_FOR_HINT, which are no-ops for logical rep
sender WalSndCaughtUp becomes true; sender sleeps in WalSndLoop()
  receiver applies the changes
  receiver send_feedback() reports progress up to Transaction/COMMIT
  sender wakes up, accepts the report
sender calls WalSndWaitForWal(), which sends a keepalive due to "MyWalSnd->write < sentPtr"
receiver gets keepalive, send_feedback() reports progress up to XLOG/FPI_FOR_HINT
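
The WalSndLoop() sleep both patterns reach is governed by a condition that
covers both the caught-up case and the pending-output case. Roughly
(paraphrased, so details may differ by branch):

    /*
     * Today's condition: block when caught up OR when the output buffer
     * still holds unsent data.  The caught-up branch is the one that can
     * sleep without first passing through WalSndWaitForWal()'s keepalive
     * check.
     */
    if ((WalSndCaughtUp && !streamingDoneSending) ||
        pq_is_send_pending())
    {
        /* ... compute sleeptime and wakeEvents, then wait on latch/socket ... */
    }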

The fix is to test "MyWalSnd->write < sentPtr" before any further sleep. The
test is unnecessary when sleeping due to pq_is_send_pending(); in that case
the receiver is not idle and will send a reply before it goes idle. I changed
WalSndLoop() to sleep only for pq_is_send_pending(). For all other sleep
reasons, the sleep will happen in WalSndWaitForWal(). Attached. I don't know
whether this is important outside of testing scenarios. I lean against
back-patching, but I will back-patch if someone thinks this qualifies as a
performance bug.
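
Concretely, the patch narrows that WalSndLoop() condition to just the
pending-output case; a sketch of the intended shape (the attached patch is
authoritative, this is only an illustration):

    /*
     * Block here only while the output buffer has unsent data.  When
     * caught up, leave the sleeping to WalSndWaitForWal(), which runs its
     * keepalive check first.
     */
    if (pq_is_send_pending())
    {
        /* ... wait for the socket to become write-ready ... */
    }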

Thanks,
nm

[1] I spot-checked only my animals, since I wanted to experiment on an
affected animal.

Attachment: WalSndLoop-stall-v1.patch (text/plain, 2.9 KB)
