Re: Add Information during standby recovery conflicts

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: bdrouvot(at)amazon(dot)com, sawada(dot)mshk(at)gmail(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, masahiko(dot)sawada(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add Information during standby recovery conflicts
Date: 2020-12-15 03:04:35
Message-ID: 20201215.120435.225938652181905969.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Tue, 15 Dec 2020 02:00:21 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> Thanks for the review! I'm thinking to wait half a day before
> commiting
> the patch just in the case someone may object the patch.

Sorry for coming late. I have looked only the latest thread so I
should be missing many things so please ignore if it was settled in
the past discussion.

It emits messages like the follows;

[40509:startup] LOG: recovery still waiting after 1021.431 ms: recovery conflict on lock
[40509:startup] DETAIL: Conflicting processes: 41171, 41194.
[40509:startup] CONTEXT: WAL redo at 0/3013158 for Standby/LOCK: xid 510 db 13609 rel 16384

IFAIK DETAIL usually shows ordinary sentences so the first word is
capitalized and ends with a period. But it is not a sentence so
following period looks odd. a searcheing the tree for errdetails
showed some anomalies.

src/backend/parser/parse_param.c errdetail("%s versus %s",
src/backend/jit/llvm/llvmjit_error.cpp errdetail("while in LLVM")));
src/backend/replication/logical/tablesync.c errdetail("The error was: %s", res->err)));
src/backend/tcop/postgres.c errdetail("prepare: %s", pstmt->plansource->query_string);
src/backend/tcop/postgres.c errdetail("abort reason: recovery conflict");

and one similar occurance:

src/backend/utils/adt/dbsize.c errdetail("Invalid size unit: \"%s\".", strptr),

I'm not sure, but it seems to me at least the period is unnecessary
here.

+ bool maybe_log_conflict =
+ (standbyWaitStart != 0 && !logged_recovery_conflict);

odd indentation.

+ /* Also, set the timer if necessary */
+ if (logging_timer)
+ {
+ timeouts[cnt].id = STANDBY_LOCK_TIMEOUT;
+ timeouts[cnt].type = TMPARAM_AFTER;
+ timeouts[cnt].delay_ms = DeadlockTimeout;
+ cnt++;
+ }

This doesn't consider spurious wakeup. I'm not sure it actually
happenes but we usually consider that. That is since before this
patch, but ProcWaitForSignal()'s comment says that:

> * As this uses the generic process latch the caller has to be robust against
> * unrelated wakeups: Always check that the desired state has occurred, and
> * wait again if not.

If we don't care of spurious wakeups, we should add such a comment.

+ bool maybe_log_conflict;
+ bool maybe_update_title;

Although it should be a matter of taste and I understand that the
"maybe" means that "that logging and changing of ps display may not
happen in this iteration" , that variables seem expressing
respectively "we should write log if the timeout for recovery conflict
expires" and "we should update title if 500ms elapsed". So they seem
*to me* better be just "log_conflict" and "update_title".

I feel the same with "maybe_log_conflict" in ProcSleep().

+ for recovery conflicts. This is useful in determining if recovery
+ conflicts prevent the recovery from applying WAL.

(I'm not confident on this) Isn't the sentence better be in past or
present continuous tense?

> BTW, attached is the POC patch that implements the idea discussed
> upthread;
> if log_recovery_conflict_waits is enabled, the startup process reports
> the log also after the recovery conflict was resolved and the startup
> process
> finished waiting for it. This patch needs to be applied after
> v11-0002-Log-the-standby-recovery-conflict-waits.patch is applied.

Ah. I was just writing a comment about that. I haven't looked it
closer but it looks good to me. By the way doesn't it contains a
simple fix of a comment for the base patch?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-12-15 03:08:22 Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped
Previous Message Zhihong Yu 2020-12-15 02:56:07 Re: postgres_fdw - cached connection leaks if the associated user mapping/foreign server is dropped