Re: Add Information during standby recovery conflicts

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add Information during standby recovery conflicts
Date: 2020-11-25 13:20:02
Message-ID: CAD21AoDM-KW8aKDUBjF6H7W8_rddMVP24wMxzDX6LzJ1SNyP5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 20, 2020 at 6:18 PM Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
>
> Hi,
>
> On 11/17/20 4:44 PM, Fujii Masao wrote:
> >
> > Thanks for updating the patch! Here are review comments.
> >
> > + Controls whether a log message is produced when the startup
> > process
> > + is waiting longer than <varname>deadlock_timeout</varname>
> > + for recovery conflicts.
> >
> > But a log message can be produced also when the backend is waiting
> > for recovery conflict. Right? If yes, this description needs to be
> > corrected.
>
> Thanks for looking at it!
>
> I don't think so, only the startup process should write those new log
> messages.
>
> What makes you think that would not be the case?
>
> >
> >
> > + for recovery conflicts. This is useful in determining if
> > recovery
> > + conflicts prevents the recovery from applying WAL.
> >
> > "prevents" should be "prevent"?
>
> Indeed: fixed in the new attached patch.
>
> >
> >
> > + TimestampDifference(waitStart, GetCurrentTimestamp(), &secs,
> > &usecs);
> > + msecs = secs * 1000 + usecs / 1000;
> >
> > GetCurrentTimestamp() is basically called before LogRecoveryConflict()
> > is called. So isn't it better to avoid calling GetCurrentTimestamp()
> > newly in
> > LogRecoveryConflict() and to reuse the timestamp that we got?
> > It's helpful to avoid the waste of cycles.
> >
> good catch! fixed in the new attached patch.
>
> >
> > + while (VirtualTransactionIdIsValid(*vxids))
> > + {
> > + PGPROC *proc =
> > BackendIdGetProc(vxids->backendId);
> >
> > BackendIdGetProc() can return NULL if the backend is not active
> > at that moment. This case should be handled.
> >
> handled in the new attached patch.
> >
> > + case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
> > + reasonDesc = gettext_noop("recovery is still
> > waiting recovery conflict on buffer pin");
> >
> > It's natural to use "waiting for recovery" rather than "waiting
> > recovery"?
> >
> I would be tempted to say so, the new patch makes use of "waiting for".
> >
> > + /* Also, set deadlock timeout for logging purpose if
> > necessary */
> > + if (log_recovery_conflict_waits)
> > + {
> > + timeouts[cnt].id = STANDBY_TIMEOUT;
> > + timeouts[cnt].type = TMPARAM_AFTER;
> > + timeouts[cnt].delay_ms = DeadlockTimeout;
> > + cnt++;
> > + }
> >
> > This needs to be executed only when the message has not been logged yet.
> > Right?
> >
> good catch: fixed in the new attached patch.
>

Thank you for updating the patch! Here are review comments on the
latest version patch.

+ while (VirtualTransactionIdIsValid(*vxids))
+ {
+ PGPROC *proc = BackendIdGetProc(vxids->backendId);
+
+ if (proc)
+ {
+ if (nprocs == 0)
+ appendStringInfo(&buf, "%d", proc->pid);
+ else
+ appendStringInfo(&buf, ", %d", proc->pid);
+
+ nprocs++;
+ vxids++;
+ }
+ }

We need to increment vxids even if *proc is null. Otherwise, the loop won't end.

---
+ TimestampTz cur_ts = GetCurrentTimestamp();;

There is an extra semi-colon.

---
int max_standby_streaming_delay = 30 * 1000;
+bool log_recovery_conflict_waits = false;
+bool logged_lock_conflict = false;

+ if (log_recovery_conflict_waits && !logged_lock_conflict)
+ {
+ timeouts[cnt].id = STANDBY_TIMEOUT;
+ timeouts[cnt].type = TMPARAM_AFTER;
+ timeouts[cnt].delay_ms = DeadlockTimeout;
+ cnt++;
+ }

Can we pass a bool indicating if a timeout may be needed for recovery
conflict logging from ProcSleep() to ResolveRecoveryConflictWithLock()
instead of using a static variable?

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2020-11-25 13:20:42 Re: Online checksums patch - once again
Previous Message 曾文旌 2020-11-25 13:08:04 Re: [Proposal] Global temporary tables