Re: adding wait_start column to pg_locks

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: adding wait_start column to pg_locks
Date: 2021-01-25 14:44:56
Message-ID: 88c451c7-2729-4329-fbae-6ed7664adea1@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/01/22 18:11, Fujii Masao wrote:
>
>
> On 2021/01/22 14:37, torikoshia wrote:
>> On 2021-01-21 12:48, Fujii Masao wrote:
>>
>>> Thanks for updating the patch! I think that this is really useful feature!!
>>
>> Thanks for reviewing!
>>
>>> I have two minor comments.
>>>
>>> +      <entry role="catalog_table_entry"><para role="column_definition">
>>> +       <structfield>wait_start</structfield> <type>timestamptz</type>
>>>
>>> The column name "wait_start" should be "waitstart" for the sake of consistency
>>> with other column names in pg_locks? pg_locks seems to avoid including
>>> an underscore in column names, so "locktype" is used instead of "lock_type",
>>> "virtualtransaction" is used instead of "virtual_transaction", etc.
>>>
>>> +       Lock acquisition wait start time. <literal>NULL</literal> if
>>> +       lock acquired.
>>>
>>
>> Agreed.
>>
>> I also changed the variable name "wait_start" in struct PGPROC and
>> LockInstanceData to "waitStart" for the same reason.
>>
>>
>>> There seems the case where the wait start time is NULL even when "grant"
>>> is false. It's better to add note about that case into the docs? For example,
>>> I found that the wait start time is NULL while the startup process is waiting
>>> for the lock. Is this only that case?
>>
>> Thanks, this is because I set 'waitstart' in the following
>> condition.
>>
>>    ---src/backend/storage/lmgr/proc.c
>>    > 1250         if (!InHotStandby)
>>
>> As far as considering this, I guess startup process would
>> be the only case.
>>
>> I also think that in case of startup process, it seems possible
>> to set 'waitstart' in ResolveRecoveryConflictWithLock(), so I
>> did it in the attached patch.
>
> This change seems to cause "waitstart" to be reset every time
> ResolveRecoveryConflictWithLock() is called in the do-while loop.
> I guess this is not acceptable. Right?
>
> To avoid that issue, IMO the following change is better. Thought?
>
> -       else if (log_recovery_conflict_waits)
> +       else
>         {
> +               TimestampTz now = GetCurrentTimestamp();
> +
> +               MyProc->waitStart = now;
> +
>                 /*
>                  * Set the wait start timestamp if logging is enabled and in hot
>                  * standby.
>                  */
> -               standbyWaitStart = GetCurrentTimestamp();
> +                if (log_recovery_conflict_waits)
> +                        standbyWaitStart = now
>         }
>
> This change causes the startup process to call GetCurrentTimestamp()
> additionally even when log_recovery_conflict_waits is disabled. Which
> might decrease the performance of the startup process, but that performance
> degradation can happen only when the startup process waits in
> ACCESS EXCLUSIVE lock. So if this my understanding right, IMO it's almost
> harmless to call GetCurrentTimestamp() additionally in that case. Thought?

According to the off-list discussion with you, this should not happen because ResolveRecoveryConflictWithDatabase() sets MyProc->waitStart only when it's not set yet (i.e., = 0). That's good. So I'd withdraw my comment.

+ if (MyProc->waitStart == 0)
+ MyProc->waitStart = now;
<snip>
+ MyProc->waitStart = get_timeout_start_time(DEADLOCK_TIMEOUT);

Another comment is; Doesn't the change of MyProc->waitStart need the lock table's partition lock? If yes, we can do that by moving LWLockRelease(partitionLock) just after the change of MyProc->waitStart, but which causes the time that lwlock is being held to be long. So maybe we need another way to do that.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-01-25 15:03:01 Re: About to add WAL write/fsync statistics to pg_stat_wal view
Previous Message bucoo@sohu.com 2021-01-25 14:14:40 Re: Re: parallel distinct union and aggregate support patch