Re: adding wait_start column to pg_locks

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: adding wait_start column to pg_locks
Date: 2021-02-09 10:11:35
Message-ID: 40dfaa75-1058-e811-1f7c-4cf7203a3068@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/02/09 18:13, Fujii Masao wrote:
>
>
> On 2021/02/09 17:48, torikoshia wrote:
>> On 2021-02-05 18:49, Fujii Masao wrote:
>>> On 2021/02/05 0:03, torikoshia wrote:
>>>> On 2021-02-03 11:23, Fujii Masao wrote:
>>>>>> 64-bit fetches are not atomic on some platforms. So spinlock is necessary when updating "waitStart" without holding the partition lock? Also GetLockStatusData() needs spinlock when reading "waitStart"?
>>>>>
>>>>> Also it might be worth thinking to use 64-bit atomic operations like
>>>>> pg_atomic_read_u64(), for that.
>>>>
>>>> Thanks for your suggestion and advice!
>>>>
>>>> In the attached patch I used pg_atomic_read_u64() and pg_atomic_write_u64().
>>>>
>>>> waitStart is TimestampTz i.e., int64, but it seems pg_atomic_read_xxx and pg_atomic_write_xxx only supports unsigned int, so I cast the type.
>>>>
>>>> I may be using these functions not correctly, so if something is wrong, I would appreciate any comments.
>>>>
>>>>
>>>> About the documentation, since your suggestion seems better than v6, I used it as is.
>>>
>>> Thanks for updating the patch!
>>>
>>> +    if (pg_atomic_read_u64(&MyProc->waitStart) == 0)
>>> +        pg_atomic_write_u64(&MyProc->waitStart,
>>> +                            pg_atomic_read_u64((pg_atomic_uint64 *) &now));
>>>
>>> pg_atomic_read_u64() is really necessary? I think that
>>> "pg_atomic_write_u64(&MyProc->waitStart, now)" is enough.
>>>
>>> +        deadlockStart = get_timeout_start_time(DEADLOCK_TIMEOUT);
>>> +        pg_atomic_write_u64(&MyProc->waitStart,
>>> +                    pg_atomic_read_u64((pg_atomic_uint64 *) &deadlockStart));
>>>
>>> Same as above.
>>>
>>> +        /*
>>> +         * Record waitStart reusing the deadlock timeout timer.
>>> +         *
>>> +         * It would be ideal this can be synchronously done with updating
>>> +         * lock information. Howerver, since it gives performance impacts
>>> +         * to hold partitionLock longer time, we do it here asynchronously.
>>> +         */
>>>
>>> IMO it's better to comment why we reuse the deadlock timeout timer.
>>>
>>>      proc->waitStatus = waitStatus;
>>> +    pg_atomic_init_u64(&MyProc->waitStart, 0);
>>>
>>> pg_atomic_write_u64() should be used instead? Because waitStart can be
>>> accessed concurrently there.
>>>
>>> I updated the patch and addressed the above review comments. Patch attached.
>>> Barring any objection, I will commit this version.
>>
>> Thanks for modifying the patch!
>> I agree with your comments.
>>
>> BTW, I ran pgbench several times before and after applying
>> this patch.
>>
>> The environment is virtual machine(CentOS 8), so this is
>> just for reference, but there were no significant difference
>> in latency or tps(both are below 1%).
>
> Thanks for the test! I pushed the patch.

But I reverted the patch because buildfarm members rorqual and
prion don't like the patch. I'm trying to investigate the cause
of this failures.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rorqual&dt=2021-02-09%2009%3A20%3A10
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2021-02-09%2009%3A13%3A16

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-02-09 10:46:37 Re: pg_replication_origin_drop API potential race condition
Previous Message Amit Kapila 2021-02-09 09:38:39 Re: Single transaction in the tablesync worker?