Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Date: 2022-01-16 11:00:00
Message-ID: 68e3771a-2842-14bb-b115-bbd9702b7cbd@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Andres,
15.01.2022 01:44, Andres Freund wrote:
> What about instead giving WalReceiverConn an internal WaitEventSet, and using
> that consistently? I've attached a draft for that.
>
> Alexander, could you test with that patch applied?
Unfortunately, this patch doesn't fix the issue. The test
commit_ts/002_standby still fails (on iterations 3, 1, 8 for me).
With the debugging logging added:
...
elog(LOG, "libpqrcv_wait() before WaitEventSetWait(%p)", conn->wes);
    nevents = WaitEventSetWait(conn->wes, timeout, &event, 1,
wait_event_info);
elog(LOG, "libpqrcv_wait() after WaitEventSetWait");
...
elog(LOG, "WaitEventSetWaitBlock before WaitForMultipleObjects(%p)",
set->handles);
    rc = WaitForMultipleObjects(set->nevents + 1, set->handles, FALSE,
                                cur_timeout);
elog(LOG, "WaitEventSetWaitBlock aftet WaitForMultipleObjects");
...
and so on.

I see in 002_standby_standby.log
...
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: 145
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: 0
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_wait() before
WaitEventSetWait(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000063AC30)
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_wait() after
WaitEventSetWait
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: 0
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_wait() before
WaitEventSetWait(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000063AC30)
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(0000000000649FB8)
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000064A020)
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_wait() after
WaitEventSetWait
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: -1
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive before
libpqrcv_PQgetResult(1)
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive
libpqrcv_PQgetResult(1) returned 0000000000692400
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive before
libpqrcv_PQgetResult(2)
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_wait() before
WaitEventSetWait(000000000063ABA8)
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(000000000063ABA8)
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000063AC30)
2022-01-16 13:31:36.368 MSK [984] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.368 MSK [984] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
...
After that, the process 1336 hangs till shutdown.

Best regards,
Alexander

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2022-01-16 11:35:56 Re: pg_basebackup WAL streamer shutdown is bogus - leading to slow tests
Previous Message Julien Rouhaud 2022-01-16 09:39:38 Re: Isn't wait_for_catchup slightly broken?