Re: 001_rep_changes.pl stalls

From: Noah Misch <noah(at)leadboat(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 001_rep_changes.pl stalls
Date: 2020-04-20 07:59:54
Message-ID: 20200420075954.GB1395671@rfd.leadboat.com
Lists: pgsql-hackers

On Mon, Apr 20, 2020 at 04:15:40PM +0900, Kyotaro Horiguchi wrote:
> At Sat, 18 Apr 2020 00:01:42 -0700, Noah Misch <noah(at)leadboat(dot)com> wrote in
> > On Fri, Apr 17, 2020 at 05:06:29PM +0900, Kyotaro Horiguchi wrote:
> > > At Fri, 17 Apr 2020 17:00:15 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > > > By the way, if the latch is consumed in WalSndLoop, the succeeding call to
> > > > WalSndWaitForWal cannot be woken up by the latch-set. Doesn't that
> > > > cause missing wakeups? (In other words, overlooking the wakeup latch.)
> > >
> > > - Since the only source other than timeout of walsender wakeup is latch,
> > > - we should avoid wasteful consuming of latch. (It is the same issue
> > > - with [1]).
> > >
> > > + Since walsender is woken up by LSN advancement via latch, we should
> > > + avoid wasteful consuming of latch. (It is the same issue with [1]).
> > >
> > >
> > > > If wakeup signal is not remembered on walsender (like
> > > > InterruptPending), WalSndPhysical cannot enter a sleep with
> > > > confidence.
> >
> > No; per latch.h, "What must be avoided is placing any checks for asynchronous
> > events after WaitLatch and before ResetLatch, as that creates a race
> > condition." In other words, the thing to avoid is calling ResetLatch()
> > without next examining all pending work that a latch would signal. Each
> > walsender.c WaitLatch call does follow the rules.
>
> I didn't mean that, of course. I was thinking of more or less the same
> thing as moving the trigger from latch to signal, where the handler sets a
> flag and calls SetLatch(). If we use a bare latch, we should avoid falsely
> entering sleep, which also makes things complex.

I don't understand. If there's a defect, can you write a test case or
describe a sequence of events (e.g. at line X, variable Y has value Z)?
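
To make the latch.h rule quoted above concrete, here is a toy,
single-threaded model of the two orderings. It is not PostgreSQL code:
ToyLatch, Step, lost_wakeup() and the two step arrays are names made up for
this sketch, and the "producer" is simply injected at a chosen point rather
than running concurrently. It assumes the waiter has just returned from a
wait with the latch set and then performs a reset and a check for work in
some order.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration only; not PostgreSQL's latch code. */
typedef struct { bool is_set; } ToyLatch;
typedef enum { STEP_RESET, STEP_CHECK } Step;

/* Reset first, then examine pending work. */
static const Step RESET_THEN_CHECK[2] = { STEP_RESET, STEP_CHECK };
/* Check for work first, then reset (the order latch.h warns about). */
static const Step CHECK_THEN_RESET[2] = { STEP_CHECK, STEP_RESET };

/*
 * Model a waiter that has just been woken (latch set) and now runs two
 * steps in the given order.  A producer posts work and sets the latch
 * immediately before step number producer_at (2 means after both steps).
 * Returns true if the waiter would then go back to sleep, i.e. the latch
 * is clear, while work is still pending: a lost wakeup.
 */
static bool
lost_wakeup(const Step steps[2], int producer_at)
{
    ToyLatch latch = { true };
    bool     work_pending = false;

    assert(producer_at >= 0 && producer_at <= 2);

    for (int i = 0; i <= 2; i++)
    {
        if (i == producer_at)
        {
            /* Producer: publish work, then set the latch. */
            work_pending = true;
            latch.is_set = true;
        }
        if (i == 2)
            break;
        if (steps[i] == STEP_RESET)
            latch.is_set = false;
        else
            work_pending = false;   /* waiter handles any visible work */
    }
    return work_pending && !latch.is_set;
}
```

In this model, only lost_wakeup(CHECK_THEN_RESET, 1) returns true: the
producer's latch-set lands between the check and the reset and is wiped
out. Every interleaving of RESET_THEN_CHECK ends with either no work
pending or the latch still set, so the next wait would not block on
unseen work.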
