Re: A bug in LWLOCK_STATS

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: A bug in LWLOCK_STATS
Date: 2020-02-06 05:49:44
Message-ID: 5a0286c8-f464-9f69-d2e0-c9ff05085fb4@oss.nttdata.com
Lists: pgsql-hackers

On 2020/02/05 17:13, Julien Rouhaud wrote:
> On Wed, Feb 05, 2020 at 02:43:49PM +0900, Fujii Masao wrote:
>> Hi,
>>
>> When I compiled PostgreSQL with -DLWLOCK_STATS and tried to check
>> the statistics of light-weight locks, I observed that more than one
>> statistics entry was output *for the same backend process and
>> the same lwlock*. For example, I got the following four entries
>> when I checked how the process with PID 81141 processed ProcArrayLock.
>> This is strange; IMO only one statistics entry should be output for
>> the same backend process and lwlock.
>>
>> $ grep "PID 81141 lwlock ProcArrayLock" data/log/postgresql-2020-02-05_141842.log
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 4000 exacq 0 blk 0 spindelay 0 dequeue self 0
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 2 exacq 0 blk 0 spindelay 0 dequeue self 0
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 6001 exacq 1 blk 0 spindelay 0 dequeue self 0
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 5 exacq 1 blk 0 spindelay 0 dequeue self 0
>>
>> The cause of this issue is that the key variable used for the lwlock
>> hash table was not fully initialized. The key consists of two fields,
>> which are initialized as follows, but the 4 bytes allocated for
>> alignment padding were not initialized. So even if the same key was
>> specified, hash_search(HASH_ENTER) could not find the existing
>> entry for that key and created a new one.
>>
>> key.tranche = lock->tranche;
>> key.instance = lock;
>>
>> Attached is a patch fixing this issue by initializing the key
>> variable with zero. With the patch applied, I confirmed that only one
>> statistics entry is output for the same process and the same lwlock.
>> The patch should also greatly reduce the volume of lwlock
>> statistics output.
>>
>> This issue was introduced by commit 3761fe3c20. So the patch needs
>> to be back-patched to v10.

Pushed.

> Good catch! The patch looks good to me. Just in case, I looked at other users
> of HASH_BLOBS and AFAICT there are no other cases of keys that can contain
> padding bytes that aren't memset first.

Thanks Julien and Horiguchi-san for reviewing the patch
and checking other cases!
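
For anyone reading this thread later, here is a minimal sketch of the padding
problem discussed above. It is not the PostgreSQL code: demo_key, make_key,
keys_match and padding_bytes are hypothetical names used only for illustration,
and the "garbage" is simulated by pre-filling the key with a known byte instead
of reading uninitialized memory. It assumes, as the fix itself does, that
storing to a field leaves the padding bytes alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a two-field hash key: an int, then a pointer. */
typedef struct demo_key
{
    int   tranche;
    void *instance;
} demo_key;

/*
 * Pre-fill the key's storage with `fill` (simulating whatever bytes happened
 * to be there), then assign both fields. Any alignment padding keeps the
 * fill value, because the field assignments do not touch it.
 */
void
make_key(demo_key *key, unsigned char fill, int tranche, void *instance)
{
    assert(key != NULL);
    memset(key, fill, sizeof(*key));
    key->tranche = tranche;
    key->instance = instance;
}

/* Byte-wise comparison over the whole struct, padding included. */
int
keys_match(const demo_key *a, const demo_key *b)
{
    return memcmp(a, b, sizeof(demo_key)) == 0;
}

/* Number of bytes in the struct that belong to neither field. */
size_t
padding_bytes(void)
{
    return sizeof(demo_key) - sizeof(int) - sizeof(void *);
}
```

Two keys built with make_key(&k, 0x00, ...) and the same field values always
match. A key built with a nonzero fill matches them only on platforms where
padding_bytes() returns 0. On a typical 64-bit build there are 4 padding bytes,
so a byte-wise lookup treats logically identical keys as different unless the
key is zeroed first.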

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
