pg_stat_lwlocks view - lwlocks statistics, round 2

From: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Qi Huang <huangqiyx(at)hotmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: pg_stat_lwlocks view - lwlocks statistics, round 2
Date: 2012-10-13 14:05:00
Message-ID: 5079750C.6060900@uptime.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

I have fixed my previous patch for pg_stat_lwlocks view, and
as Josh commented, it now supports local and global (shared)
statistics in the same system view.

Local statistics means the counters are only effective in the
same session, and shared ones means the counters are shared within
the entire cluster.

Also the global statistics would be collected via pgstat collector
process like other statistics do.

Now, the global statistics struct has been splitted into two parts
for different use, for bgwriter stats and lwlock stats.

Therefore, calling pg_stat_reset_shared('bgwriter') or
pg_stat_reset_shared('lwlocks') would reset dedicated struct,
not entire PgStat_GlobalStats.

Comments and review are always welcome.

Regards,

------------------------------------------------------------------------------
postgres=# SELECT * FROM pg_stat_lwlocks;
lwlockid | local_calls | local_waits | local_time_ms | shared_calls |
shared_waits | shared_time_ms
----------+-------------+-------------+---------------+--------------+--------------+----------------
0 | 0 | 0 | 0 | 4268 |
0 | 0
1 | 43 | 0 | 0 | 387 |
0 | 0
2 | 0 | 0 | 0 | 19 |
0 | 0
3 | 0 | 0 | 0 | 28 |
0 | 0
4 | 3 | 0 | 0 | 315 |
0 | 0
5 | 0 | 0 | 0 | 24 |
0 | 0
6 | 1 | 0 | 0 | 76 |
0 | 0
7 | 0 | 0 | 0 | 16919 |
0 | 0
8 | 0 | 0 | 0 | 0 |
0 | 0
9 | 0 | 0 | 0 | 0 |
0 | 0
10 | 0 | 0 | 0 | 0 |
0 | 0
11 | 0 | 0 | 0 | 75 |
0 | 0
12 | 0 | 0 | 0 | 0 |
0 | 0
13 | 0 | 0 | 0 | 0 |
0 | 0
14 | 0 | 0 | 0 | 0 |
0 | 0
15 | 0 | 0 | 0 | 0 |
0 | 0
16 | 0 | 0 | 0 | 0 |
0 | 0
17 | 0 | 0 | 0 | 61451 |
6 | 0
18 | 0 | 0 | 0 | 0 |
0 | 0
19 | 0 | 0 | 0 | 0 |
0 | 0
20 | 0 | 0 | 0 | 0 |
0 | 0
21 | 1 | 0 | 0 | 9 |
0 | 0
22 | 0 | 0 | 0 | 0 |
0 | 0
23 | 0 | 0 | 0 | 0 |
0 | 0
24 | 0 | 0 | 0 | 1 |
0 | 0
25 | 0 | 0 | 0 | 0 |
0 | 0
26 | 2 | 0 | 0 | 18 |
0 | 0
27 | 0 | 0 | 0 | 0 |
0 | 0
28 | 0 | 0 | 0 | 0 |
0 | 0
29 | 0 | 0 | 0 | 0 |
0 | 0
30 | 0 | 0 | 0 | 0 |
0 | 0
31 | 0 | 0 | 0 | 0 |
0 | 0
32 | 0 | 0 | 0 | 0 |
0 | 0
33 | 4 | 0 | 0 | 207953 |
0 | 0
50 | 8 | 0 | 0 | 33388 |
0 | 0
67 | 0 | 0 | 0 | 0 |
0 | 0
(36 rows)

postgres=#
------------------------------------------------------------------------------

2012/06/26 21:11, Satoshi Nagayasu wrote:
> Hi all,
>
> I've modified the pg_stat_lwlocks patch to be able to work with
> the latest PostgreSQL Git code.
>
> This patch provides:
> pg_stat_lwlocks New system view to show lwlock statistics.
> pg_stat_get_lwlocks() New function to retrieve lwlock statistics.
> pg_stat_reset_lwlocks() New function to reset lwlock statistics.
>
> Please try it out.
>
> Regards,
>
> 2012/06/26 5:29, Satoshi Nagayasu wrote:
>> Hi all,
>>
>> I've been working on a new system view, pg_stat_lwlocks, to observe
>> LWLock, and just completed my 'proof-of-concept' code that can work
>> with version 9.1.
>>
>> Now, I'd like to know the possibility of this feature for future
>> release.
>>
>> With this patch, DBA can easily determine a bottleneck around lwlocks.
>> --------------------------------------------------
>> postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10;
>> lwlockid | calls | waits | time_ms
>> ----------+--------+-------+---------
>> 49 | 193326 | 32096 | 23688
>> 8 | 3305 | 133 | 1335
>> 2 | 21 | 0 | 0
>> 4 | 135188 | 0 | 0
>> 5 | 57935 | 0 | 0
>> 6 | 141 | 0 | 0
>> 7 | 24580 | 1 | 0
>> 3 | 3282 | 0 | 0
>> 1 | 41 | 0 | 0
>> 9 | 3 | 0 | 0
>> (10 rows)
>>
>> postgres=#
>> --------------------------------------------------
>>
>> In this view,
>> 'lwlockid' column represents LWLockId used in the backends.
>> 'calls' represents how many times LWLockAcquire() was called.
>> 'waits' represents how many times LWLockAcquire() needed to wait
>> within it before lock acquisition.
>> 'time_ms' represents how long LWLockAcquire() totally waited on
>> a lwlock.
>>
>> And lwlocks that use a LWLockId range, such as BufMappingLock or
>> LockMgrLock, would be grouped and summed up in a single record.
>> For example, lwlockid 49 in the above view represents LockMgrLock
>> statistics.
>>
>> Now, I know there are some considerations.
>>
>> (1) Performance
>>
>> I've measured LWLock performance both with and without the patch,
>> and confirmed that this patch does not affect the LWLock perfomance
>> at all.
>>
>> pgbench scores with the patch:
>> tps = 900.906658 (excluding connections establishing)
>> tps = 908.528422 (excluding connections establishing)
>> tps = 903.900977 (excluding connections establishing)
>> tps = 910.470595 (excluding connections establishing)
>> tps = 909.685396 (excluding connections establishing)
>>
>> pgbench scores without the patch:
>> tps = 909.096785 (excluding connections establishing)
>> tps = 894.868712 (excluding connections establishing)
>> tps = 910.074669 (excluding connections establishing)
>> tps = 904.022770 (excluding connections establishing)
>> tps = 895.673830 (excluding connections establishing)
>>
>> Of course, this experiment was not I/O bound, and the cache hit ratio
>> was>99.9%.
>>
>> (2) Memory space
>>
>> In this patch, I added three new members to LWLock structure
>> as uint64 to collect statistics.
>>
>> It means that those members must be held in the shared memory,
>> but I'm not sure whether it's appropriate.
>>
>> I think another possible option is holding those statistics
>> values in local (backend) process memory, and send them through
>> the stat collector process (like other statistics values).
>>
>> (3) LWLock names (or labels)
>>
>> Now, pg_stat_lwlocks view shows LWLockId itself. But LWLockId is
>> not easy for DBA to determine actual lock type.
>>
>> So, I want to show LWLock names (or labels), like 'WALWriteLock'
>> or 'LockMgrLock', but how should I implement it?
>>
>> Any comments?
>>
>> Regards,
>
>

--
Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Uptime Technologies, LLC. http://www.uptime.jp

Attachment Content-Type Size
pg_stat_lwlocks_20121013.diff text/plain 25.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Satoshi Nagayasu 2012-10-13 14:34:12 Re: pg_stat_lwlocks view - lwlocks statistics, round 2
Previous Message Andrew Dunstan 2012-10-13 14:00:34 Re: Successor of MD5 authentication, let's use SCRAM