Re: Review for GetWALAvailability()

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Review for GetWALAvailability()
Date: 2020-06-17 01:01:21
Message-ID: 20200617.100121.2112248049559508701.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Wed, 17 Jun 2020 00:46:38 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> >> 2. InvalidateObsoleteReplicationSlots() uses the loop to scan
> >> replication
> >> slots array and uses the "for" loop in each scan. Also it calls
> >> ReplicationSlotAcquire() for each "for" loop cycle, and
> >> ReplicationSlotAcquire() uses another loop to scan replication slots
> >> array. I don't think this is good design.
> >>
> >> ISTM that we can get rid of ReplicationSlotAcquire()'s loop because
> >> InvalidateObsoleteReplicationSlots() already know the index of the
> >> slot
> >> that we want to find. The attached patch does that. Thought?
> > The inner loop is expected to run at most several times per
> > checkpoint, which won't be a serious problem. However, it is better if
> > we can get rid of that in a reasonable way.
> > The attached patch changes the behavior for SAB_Block. Before the
> > patch, it rescans from the first slot for the same name, but with the
> > patch it just rechecks the same slot. The only caller of the function
> > with SAB_Block is ReplicationSlotDrop and I don't come up with a case
> > where another slot with the same name is created at different place
> > before the condition variable fires. But I'm not sure the change is
> > completely safe.
>
> Yes, that change might not be safe. So I'm thinking another approach
> to
> fix the issues.
>
> > Maybe some assertion is needed?
> >
> >> 3. There is a corner case where the termination of walsender cleans up
> >> the temporary replication slot while
> >> InvalidateObsoleteReplicationSlots()
> >> is sleeping on ConditionVariableTimedSleep(). In this case,
> >> ReplicationSlotAcquire() is called in the subsequent cycle of the
> >> "for"
> >> loop, cannot find the slot and then emits ERROR message. This leads
> >> to the failure of checkpoint by the checkpointer.
> > Agreed.
> >
> >> To avoid this case, if SAB_Inquire is specified,
> >> ReplicationSlotAcquire()
> >> should return the special value instead of emitting ERROR even when
> >> it cannot find the slot. Also InvalidateObsoleteReplicationSlots()
> >> should
> >> handle that special returned value.
> > I thought the same thing hearing that.
>
> While reading InvalidateObsoleteReplicationSlots() code, I found
> another issue.
>
> ereport(LOG,
> (errmsg("terminating walsender %d
> because replication slot \"%s\" is too
> far behind",
> wspid,
> NameStr(slotname))));
> (void) kill(wspid, SIGTERM);
>
> wspid indicates the PID of process using the slot. That process can be
> a backend, for example, executing pg_replication_slot_advance().
> So "walsender" in the above log message is not always correct.

Agreed.

>
> int wspid = ReplicationSlotAcquire(NameStr(slotname),
> SAB_Inquire);
>
> Why do we need to call ReplicationSlotAcquire() here and mark the slot
> as
> used by the checkpointer? Isn't it enough to check directly the slot's
> active_pid, instead?
> Maybe ReplicationSlotAcquire() is necessary because
> ReplicationSlotRelease() is called later? If so, why do we need to
> call
> ReplicationSlotRelease()? ISTM that we don't need to do that if the
> slot's
> active_pid is zero. No?

My understanding of the reason is that we update a slot value here.
The restriction allows the owner of a slot to assume that all the slot
values don't voluntarily change.

slot.h:104
| * - Individual fields are protected by mutex where only the backend owning
| * the slot is authorized to update the fields from its own slot. The
| * backend owning the slot does not need to take this lock when reading its
| * own fields, while concurrent backends not owning this slot should take the
| * lock when reading this slot's data.

> If my understanding is right, I'd like to propose the attached patch.
> It introduces DeactivateReplicationSlot() and replace the "for" loop
> in InvalidateObsoleteReplicationSlots() with
> it. ReplicationSlotAcquire()
> and ReplicationSlotRelease() are no longer called there.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2020-06-17 01:15:50 Re: Improve planner cost estimations for alternative subplans
Previous Message David Zhang 2020-06-17 00:54:17 Add an option to allow run regression test for individual module on Windows build