Re: Timeout failure in 019_replslot_limit.pl

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Timeout failure in 019_replslot_limit.pl
Date: 2021-09-20 12:12:39
Message-ID: YUh6t0fvOjkKwjUJ@paquier.xyz
Lists: pgsql-hackers

On Sat, Sep 18, 2021 at 05:19:04PM -0300, Alvaro Herrera wrote:
> Hmm, sounds a possibly useful idea to explore, but I would only do so if
> the other ideas prove fruitless, because it sounds like it'd have more
> moving parts. Can you please first test if the idea of sending the signal
> twice is enough?

This idea does not work. I got one failure after 5 tries.

> If that doesn't work, let's try Horiguchi-san's idea
> of using some `ps` flags to find the process.

Tried this one as well, only to see the same failure. I looked at the
state of the processes while the test was querying pg_replication_slots,
and it was the expected state after the WAL sender received SIGCONT:
USER   PID %CPU %MEM     VSZ  RSS TT STAT STARTED    TIME COMMAND
toto 12663  0.0  0.0 5014468 3384 ?? Ss    8:30PM 0:00.00 postgres: primary3: walsender toto [local] streaming 0/720000
toto 12662  0.0  0.0 4753092 3936 ?? Ts    8:30PM 0:00.01 postgres: standby_3: walreceiver streaming 0/7000D8

The test gets the right PIDs, as the logs showed:
ok 17 - have walsender pid 12663
ok 18 - have walreceiver pid 12662

So it does not seem that this is an issue with the signals.
Perhaps we'd better wait for a checkpoint to complete, for example by
scanning the logs, before running the query on pg_replication_slots to
make sure that the slot is invalidated?
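
To illustrate the log-scanning idea, here is a rough standalone sketch in
Python rather than in the Perl of the TAP test itself. The helper name
wait_for_log, its parameters, and the "checkpoint complete" pattern are
assumptions made for illustration; the actual test would rely on whatever
log-reading helpers the TAP framework provides, and on the server logging
checkpoint completion.

```python
import re
import time
from pathlib import Path


def wait_for_log(logfile, pattern, offset=0, timeout=180.0, interval=0.1):
    """Poll `logfile` until `pattern` shows up at or after byte `offset`.

    Hypothetical helper: returns True once the pattern is found and
    False if `timeout` seconds pass without a match.
    """
    deadline = time.monotonic() + timeout
    regex = re.compile(pattern)
    path = Path(logfile)
    while True:
        if path.exists():
            # Only look at what was written after `offset`, so that an
            # older checkpoint message is not mistaken for the new one.
            with path.open("rb") as f:
                f.seek(offset)
                text = f.read().decode("utf-8", errors="replace")
            if regex.search(text):
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller would record the size of the log file before triggering the
checkpoint, pass it as offset, and only query pg_replication_slots once
the function returns True.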
--
Michael
