Re: Some problems of recovery conflict wait events

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Some problems of recovery conflict wait events
Date: 2020-04-02 06:34:17
Message-ID: 7f382d57-e640-6b29-806c-58ee533d7303@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/04/02 14:25, Masahiko Sawada wrote:
> On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2020/03/30 20:10, Masahiko Sawada wrote:
>>> On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>
>>>>
>>>>
>>>> On 2020/03/04 14:31, Masahiko Sawada wrote:
>>>>> On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2020/03/04 13:27, Michael Paquier wrote:
>>>>>>> On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:
>>>>>>>> Yeah, so 0001 patch sets existing wait events to recovery conflict
>>>>>>>> resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
>>>>>>>> to the recovery conflict on a snapshot. 0003 patch improves these wait
>>>>>>>> events by adding the new type of wait event such as
>>>>>>>> WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
>>>>>>>> is the fix for existing versions and 0003 patch is an improvement for
>>>>>>>> only PG13. Did you mean even 0001 patch doesn't fit for back-patching?
>>>>>>
>>>>>> Yes, it looks like a improvement rather than bug fix.
>>>>>>
>>>>>
>>>>> Okay, understand.
>>>>>
>>>>>>> I got my eyes on this patch set. The full patch set is in my opinion
>>>>>>> just a set of improvements, and not bug fixes, so I would refrain from
>>>>>>> back-backpatching.
>>>>>>
>>>>>> I think that the issue (i.e., "waiting" is reported twice needlessly
>>>>>> in PS display) that 0002 patch tries to fix is a bug. So it should be
>>>>>> fixed even in the back branches.
>>>>>
>>>>> So we need only two patches: one fixes process title issue and another
>>>>> improve wait event. I've attached updated patches.
>>>>
>>>> I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
>>>>
>>>> - ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
>>>> + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
>>>>
>>>> Currently the wait event indicating the wait for buffer pin has already
>>>> been reported. But the above change in the patch changes the name of
>>>> wait event for buffer pin only in the startup process. Is this really useful?
>>>> Isn't the existing wait event for buffer pin enough?
>>>>
>>>> - /* Wait to be signaled by the release of the Relation Lock */
>>>> - ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>>>> + /* Wait to be signaled by the release of the Relation Lock */
>>>> + ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
>>>>
>>>> Same as above. Isn't the existing wait event enough?
>>>
>>> Yeah, we can use the existing wait events for buffer pin and lock.
>>>
>>>>
>>>> - /*
>>>> - * Progressively increase the sleep times, but not to more than 1s, since
>>>> - * pg_usleep isn't interruptible on some platforms.
>>>> - */
>>>> - standbyWait_us *= 2;
>>>> - if (standbyWait_us > 1000000)
>>>> - standbyWait_us = 1000000;
>>>> + WaitLatch(MyLatch,
>>>> + WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
>>>> + STANDBY_WAIT_MS,
>>>> + wait_event_info);
>>>> + ResetLatch(MyLatch);
>>>>
>>>> ResetLatch() should be called before WaitLatch()?
>>>
>>> Fixed.
>>>
>>>>
>>>> Could you tell me why you dropped the "increase-sleep-times" logic?
>>>
>>> I thought we can remove it because WaitLatch is interruptible but my
>>> observation was not correct. The waiting startup process is not
>>> necessarily woken up by signal. I think it's still better to not wait
>>> more than 1 sec even if it's an interruptible wait.
>>
>> So we don't need to use WaitLatch() there, i.e., it's ok to keep using
>> pg_usleep()?
>>
>>> Attached patch fixes the above and introduces only two wait events of
>>> conflict resolution: snapshot and tablespace.
>>
>> Many thanks for updating the patch!
>>
>> - /* Wait to be signaled by the release of the Relation Lock */
>> - ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>> + /* Wait to be signaled by the release of the Relation Lock */
>> + ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
>> + }
>>
>> Is this change really valid? What happens if the latch is set during
>> ResolveRecoveryConflictWithVirtualXIDs()?
>> ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
>> is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
>
> Thank you for reviewing the patch!
>
> You're right. It's better to keep using pg_usleep() and set the wait
> event by pgstat_report_wait_start().
>
>>
>> + default:
>> + event_name = "unknown wait event";
>> + break;
>>
>> Seems this default case should be removed. Please see other
>> pgstat_get_wait_xxx() function, so there is no such code.
>>
>>> I also removed the wait
>>> event of conflict resolution of database since it's unlikely to become
>>> a user-visible and a long sleep as we discussed before.
>>
>> Is it worth defining new wait event type RecoveryConflict only for
>> those two events? ISTM that it's ok to use IPC event type here.
>>
>
> I dropped a new wait even type and added them to IPC wait event type.
>
> I've attached the new version patch.

Thanks for updating the patch! The patch looks good to me except
the following mior things.

+ <row>
+ <entry><literal>RecoveryConflictSnapshot</literal></entry>
+ <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+ </row>
+ <row>
+ <entry><literal>RecoveryConflictTablespace</literal></entry>
+ <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+ </row>

You need to increment the value of "morerows" in
"<entry morerows="38"><literal>IPC</literal></entry>".

The descriptions of those two events should be placed in alphabetical order
for event name. That is, they should be placed above RecoveryPause.

"vacuum cleanup" is better than "physical cleanup"?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2020-04-02 06:34:32 Re: WAL usage calculation patch
Previous Message Julien Rouhaud 2020-04-02 06:29:31 Re: User Interface for WAL usage data