Re: Some problems of recovery conflict wait events

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Some problems of recovery conflict wait events
Date: 2020-04-03 03:28:03
Message-ID: d9a08121-053f-bbcb-3a6b-4f7da8a1dac3@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/04/02 16:12, Fujii Masao wrote:
>
>
> On 2020/04/02 15:54, Masahiko Sawada wrote:
>> On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>
>>>
>>>
>>> On 2020/04/02 14:25, Masahiko Sawada wrote:
>>>> On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2020/03/30 20:10, Masahiko Sawada wrote:
>>>>>> On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2020/03/04 14:31, Masahiko Sawada wrote:
>>>>>>>> On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/04 13:27, Michael Paquier wrote:
>>>>>>>>>> On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:
>>>>>>>>>>> Yeah, so 0001 patch sets existing wait events to recovery conflict
>>>>>>>>>>> resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
>>>>>>>>>>> to the recovery conflict on a snapshot. 0003 patch improves these wait
>>>>>>>>>>> events by adding the new type of wait event such as
>>>>>>>>>>> WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
>>>>>>>>>>> is the fix for existing versions and 0003 patch is an improvement for
>>>>>>>>>>> only PG13. Did you mean even 0001 patch doesn't fit for back-patching?
>>>>>>>>>
>>>>>>>>> Yes, it looks like a improvement rather than bug fix.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Okay, understand.
>>>>>>>>
>>>>>>>>>> I got my eyes on this patch set.  The full patch set is in my opinion
>>>>>>>>>> just a set of improvements, and not bug fixes, so I would refrain from
>>>>>>>>>> back-backpatching.
>>>>>>>>>
>>>>>>>>> I think that the issue (i.e., "waiting" is reported twice needlessly
>>>>>>>>> in PS display) that 0002 patch tries to fix is a bug. So it should be
>>>>>>>>> fixed even in the back branches.
>>>>>>>>
>>>>>>>> So we need only two patches: one fixes process title issue and another
>>>>>>>> improve wait event. I've attached updated patches.
>>>>>>>
>>>>>>> I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
>>>>>>>
>>>>>>> -       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
>>>>>>> +       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
>>>>>>>
>>>>>>> Currently the wait event indicating the wait for buffer pin has already
>>>>>>> been reported. But the above change in the patch changes the name of
>>>>>>> wait event for buffer pin only in the startup process. Is this really useful?
>>>>>>> Isn't the existing wait event for buffer pin enough?
>>>>>>>
>>>>>>> -       /* Wait to be signaled by the release of the Relation Lock */
>>>>>>> -       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>>>>> +               /* Wait to be signaled by the release of the Relation Lock */
>>>>>>> +               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
>>>>>>>
>>>>>>> Same as above. Isn't the existing wait event enough?
>>>>>>
>>>>>> Yeah, we can use the existing wait events for buffer pin and lock.
>>>>>>
>>>>>>>
>>>>>>> -       /*
>>>>>>> -        * Progressively increase the sleep times, but not to more than 1s, since
>>>>>>> -        * pg_usleep isn't interruptible on some platforms.
>>>>>>> -        */
>>>>>>> -       standbyWait_us *= 2;
>>>>>>> -       if (standbyWait_us > 1000000)
>>>>>>> -               standbyWait_us = 1000000;
>>>>>>> +       WaitLatch(MyLatch,
>>>>>>> +                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
>>>>>>> +                         STANDBY_WAIT_MS,
>>>>>>> +                         wait_event_info);
>>>>>>> +       ResetLatch(MyLatch);
>>>>>>>
>>>>>>> ResetLatch() should be called before WaitLatch()?
>>>>>>
>>>>>> Fixed.
>>>>>>
>>>>>>>
>>>>>>> Could you tell me why you dropped the "increase-sleep-times" logic?
>>>>>>
>>>>>> I thought we can remove it because WaitLatch is interruptible but my
>>>>>> observation was not correct. The waiting startup process is not
>>>>>> necessarily woken up by signal. I think it's still better to not wait
>>>>>> more than 1 sec even if it's an interruptible wait.
>>>>>
>>>>> So we don't need to use WaitLatch() there, i.e., it's ok to keep using
>>>>> pg_usleep()?
>>>>>
>>>>>> Attached patch fixes the above and introduces only two wait events of
>>>>>> conflict resolution: snapshot and tablespace.
>>>>>
>>>>> Many thanks for updating the patch!
>>>>>
>>>>> -       /* Wait to be signaled by the release of the Relation Lock */
>>>>> -       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>>> +               /* Wait to be signaled by the release of the Relation Lock */
>>>>> +               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>>> +       }
>>>>>
>>>>> Is this change really valid? What happens if the latch is set during
>>>>> ResolveRecoveryConflictWithVirtualXIDs()?
>>>>> ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
>>>>> is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
>>>>
>>>> Thank you for reviewing the patch!
>>>>
>>>> You're right. It's better to keep using pg_usleep() and set the wait
>>>> event by pgstat_report_wait_start().
>>>>
>>>>>
>>>>> +               default:
>>>>> +                       event_name = "unknown wait event";
>>>>> +                       break;
>>>>>
>>>>> Seems this default case should be removed. Please see other
>>>>> pgstat_get_wait_xxx() function, so there is no such code.
>>>>>
>>>>>> I also removed the wait
>>>>>> event of conflict resolution of database since it's unlikely to become
>>>>>> a user-visible and a long sleep as we discussed before.
>>>>>
>>>>> Is it worth defining new wait event type RecoveryConflict only for
>>>>> those two events? ISTM that it's ok to use IPC event type here.
>>>>>
>>>>
>>>> I dropped a new wait even type and added them to IPC wait event type.
>>>>
>>>> I've attached the new version patch.
>>>
>>> Thanks for updating the patch! The patch looks good to me except
>>> the following mior things.
>>>
>>> +        <row>
>>> +         <entry><literal>RecoveryConflictSnapshot</literal></entry>
>>> +         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
>>> +        </row>
>>> +        <row>
>>> +         <entry><literal>RecoveryConflictTablespace</literal></entry>
>>> +         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
>>> +        </row>
>>>
>>> You need to increment the value of "morerows" in
>>> "<entry morerows="38"><literal>IPC</literal></entry>".
>>>
>>> The descriptions of those two events should be placed in alphabetical order
>>> for event name. That is, they should be placed above RecoveryPause.
>>>
>>> "vacuum cleanup" is better than "physical cleanup"?
>>
>> Agreed.
>>
>> I've attached the updated version patch.
>
> Thanks! Looks good to me. Barring any objection, I will commit this patch.

Pushed! Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-04-03 03:32:41 Re: WAL usage calculation patch
Previous Message Dilip Kumar 2020-04-03 03:24:50 Re: WAL usage calculation patch