Re: A new function to wait for the backend exit after termination

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bharath(dot)rupireddyforpostgres(at)gmail(dot)com
Cc: magnus(at)hagander(dot)net, masao(dot)fujii(at)oss(dot)nttdata(dot)com, houzj(dot)fnst(at)cn(dot)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, m(dot)usama(at)gmail(dot)com
Subject: Re: A new function to wait for the backend exit after termination
Date: 2021-03-17 02:58:47
Message-ID: 20210317.115847.2164719631356728443.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Wed, 17 Mar 2021 07:01:39 +0530, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote in
> Attaching v10 patch for further review.

The time-out mechanism doesn't count remainingtime as expected,
concretely it does the following.

do {
kill();
WaitLatch(WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, waittime);
ResetLatch(MyLatch);
remainingtime -= waittime;
} while (remainingtime > 0);

So, the WaitLatch doesn't consume as much time as the set waittime in
case of latch set. remainingtime reduces faster than the real at the
iteration.

It wouldn't happen actually but I concern about PID recycling. We can
make sure to get rid of the fear by checking for our BEENTRY instead
of PID. However, it seems to me that some additional function is
needed in pgstat.c so that we can check the realtime value of
PgBackendStatus, which might be too much.

+ /* If asked to wait, check whether the timeout value is valid or not. */
+ if (wait && pid != MyProcPid)
+ {
+ timeout = PG_GETARG_INT64(2);
+
+ if (timeout <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"timeout\" must not be negative or zero")));

This means that pg_terminate_backend accepts negative timeouts when
terminating myself, which looks odd.

Is there any reason to reject 0 as timeout?

+ * Wait only if requested and the termination is successful. Self
+ * termination is allowed but waiting is not.
+ */
+ if (wait && pid != MyProcPid && result)
+ result = pg_wait_until_termination(pid, timeout);

Why don't we wait for myself to be terminated? There's no guarantee
that myself will be terminated without failure. (I agree that that is
not so useful, but I think there's no reason not to do so.)

The first suggested signature for pg_terminate_backend() with timeout
was pg_terminate_backend(pid, timeout). The current signature (pid,
wait?, timeout) looks redundant. Maybe the reason for rejecting 0
astimeout is pg_terminate_backend(pid, true, 0) looks odd but it we
can wait forever in that case (as other features does). On the other
hand pg_terminate_backend(pid, false, 100) is apparently odd but this
patch doesn't seem to reject it. If there's no considerable reason
for the current signature, I would suggest that:

pg_terminate_backend(pid, timeout), where it waits forever if timeout
is zero and waits for the timeout if positive. Negative values are not
accepted.

That being said, I didn't find the disucssion about allowing default
timeout value by separating the boolean, if it is the consensus on
this thread, sorry for the noise.

+ ereport(WARNING,
+ (errmsg("could not check the existence of the backend with PID %d: %m",
+ pid)));
+ return false;

I think this is worth ERROR. We can avoid this handling if we look
into PgBackendEntry instead.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhihong Yu 2021-03-17 03:00:36 Re: Getting better results from valgrind leak tracking
Previous Message vignesh C 2021-03-17 02:37:44 Re: [HACKERS] logical decoding of two-phase transactions