From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Some problems of recovery conflict wait events |
Date: | 2020-03-04 02:04:00 |
Message-ID: | d60fd913-7cfc-564e-62b6-3db3995a5e33@oss.nttdata.com |
Lists: | pgsql-hackers |
On 2020/02/29 12:36, Masahiko Sawada wrote:
> On Wed, 26 Feb 2020 at 16:19, Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>>
>> On Tue, 18 Feb 2020 at 17:58, Masahiko Sawada
>> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>>>
>>> Hi all,
>>>
>>> When recovery conflicts happen on the streaming replication standby,
>>> the wait event of the startup process is null when
>>> max_standby_streaming_delay = 0 (to be exact, when the limit time
>>> calculated from max_standby_streaming_delay is behind the last WAL data
>>> receipt time). Moreover, the process title of the waiting startup
>>> process looks odd in the case of lock conflicts.
>>>
>>> 1. When max_standby_streaming_delay > 0 and the startup process
>>> conflicts with a lock,
>>>
>>> * wait event
>>> backend_type | wait_event_type | wait_event
>>> --------------+-----------------+------------
>>> startup | Lock | relation
>>> (1 row)
>>>
>>> * ps
>>> 42513 ?? Ss 0:00.05 postgres: startup recovering 000000010000000000000003 waiting
>>>
>>> Looks good.
>>>
>>> 2. When max_standby_streaming_delay > 0 and the startup process
>>> conflicts with a snapshot,
>>>
>>> * wait event
>>> backend_type | wait_event_type | wait_event
>>> --------------+-----------------+------------
>>> startup | |
>>> (1 row)
>>>
>>> * ps
>>> 44299 ?? Ss 0:00.05 postgres: startup recovering 000000010000000000000003 waiting
>>>
>>> wait_event_type and wait_event are null in spite of waiting for
>>> conflict resolution.
>>>
>>> 3. When max_standby_streaming_delay = 0 and the startup process
>>> conflicts with a lock,
>>>
>>> * wait event
>>> backend_type | wait_event_type | wait_event
>>> --------------+-----------------+------------
>>> startup | |
>>> (1 row)
>>>
>>> * ps
>>> 46510 ?? Ss 0:00.05 postgres: startup recovering 000000010000000000000003 waiting waiting
>>>
>>> wait_event_type and wait_event are null and the process title is
>>> wrong; "waiting" appears twice.
>>>
>>> The cause of the first problem (wait_event_type and wait_event are not
>>> set) is that WaitExceedsMaxStandbyDelay, which is called by
>>> ResolveRecoveryConflictWithVirtualXIDs, waits for other transactions
>>> using pg_usleep rather than WaitLatch. I think we can change it so
>>> that it uses WaitLatch and its callers pass wait event information.
>>>
>>> For the second problem (wrong process title), the cause is also
>>> related to ResolveRecoveryConflictWithVirtualXIDs; in the case of lock
>>> conflicts we add "waiting" to the process title in WaitOnLock but we
>>> add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
>>> have WaitOnLock not set the process title in the recovery case.
>>>
>>> This problem exists on 12, 11 and 10. I'll submit the patch.
>>>
>>
>> I've attached patches that fix the above two issues.
>>
>> The 0001 patch fixes the first problem. Currently there are 5 types of
>> recovery conflict resolution: snapshot, tablespace, lock, database and
>> buffer pin, and we set wait events for only 2 of the 5: lock
>> (only when doing ProcWaitForSignal) and buffer pin.
+1 to adding those new wait events in master. But adding them sounds like
a new feature rather than a bug fix, so ISTM that it's not back-patchable...
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
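
The duplicated "waiting" suffix described in the quoted report can be
illustrated with a small stand-alone model. This is only a sketch and not
PostgreSQL source: append_waiting, model_wait_on_lock,
model_resolve_conflict and the apply_fix flag are hypothetical names
invented for illustration, and the real WaitOnLock and
ResolveRecoveryConflictWithVirtualXIDs do far more than touch the process
title. The model assumes both paths append the suffix unconditionally, and
that the proposed change makes the lock-wait path skip the title update
during recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TITLE_MAX 128   /* hypothetical buffer size for the title */

/* Append " waiting" to a title stored in a TITLE_MAX-sized buffer. */
static void append_waiting(char *title)
{
    assert(title != NULL);
    size_t len = strlen(title);
    assert(len < TITLE_MAX);
    snprintf(title + len, TITLE_MAX - len, "%s", " waiting");
}

/*
 * Models the lock-wait path (WaitOnLock in the thread).
 * With apply_fix set, it leaves the title alone during recovery,
 * which is the change suggested in the quoted report.
 */
static void model_wait_on_lock(char *title, bool in_recovery, bool apply_fix)
{
    if (apply_fix && in_recovery)
        return;
    append_waiting(title);
}

/*
 * Models the conflict-resolution path
 * (ResolveRecoveryConflictWithVirtualXIDs in the thread),
 * which appends the suffix itself.
 */
static void model_resolve_conflict(char *title)
{
    append_waiting(title);
}
```

Calling model_wait_on_lock followed by model_resolve_conflict on the same
buffer with apply_fix = false yields a title ending in "waiting waiting";
with apply_fix = true and in_recovery = true the suffix appears once.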