Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From: Thom Brown <thom(at)linux(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Vladimir Borodin <root(at)simply(dot)name>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date: 2016-03-04 13:42:12
Message-ID: CAA-aLv6h6cLb8zNzWzuPEobO5R23=VXKd_28T4Kh0y2YvYA4FQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 4 March 2016 at 13:35, Thom Brown <thom(at)linux(dot)com> wrote:
> On 4 March 2016 at 04:05, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Wed, Feb 24, 2016 at 7:14 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>
>>> On Mon, Feb 22, 2016 at 10:05 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>>> wrote:
>>> >> I wouldn't bother tinkering with it at this point. The value isn't
>>> >> going to be recorded on disk anywhere, so it will be easy to change
>>> >> the way it's computed in the future if we ever need to do that.
>>> >>
>>> >
>>> > Okay. Find the rebased patch attached with this mail. I have moved
>>> > this patch to upcoming CF.
>>>
>>> I would call the functions pgstat_report_wait_start() and
>>> pgstat_report_wait_end() instead of pgstat_report_start_waiting() and
>>> pgstat_report_end_waiting().
>>>
>>
>> Changed as per suggestion and made these functions inline.
>>
>>> I think pgstat_get_wait_event_type should not return HWLock, a term
>>> that appears nowhere in our source tree at present. How about just
>>> "Lock"?
>>>
>>
>> Changed as per suggestion.
>>
>>> I think the wait event types should be documented - and the wait
>>> events too, perhaps.
>>>
>>
>> As discussed upthread, I have added documentation for all the possible wait
>> events and an example. Some of the LWLocks like AsyncQueueLock and
>> AsyncCtlLock are used for quite similar purpose, so I have kept there
>> explanation as same.
>>
>>> Maybe it's worth having separate wait event type names for lwlocks and
>>> lwlock tranches. We could report LWLockNamed and LWLockTranche and
>>> document the difference: "LWLockNamed indicates that the backend is
>>> waiting for a specific, named LWLock. The event name is the name of
>>> that lock. LWLockTranche indicates that the backend is waiting for
>>> any one of a group of locks with similar function. The event name
>>> identifies the general purpose of locks in that group."
>>>
>>
>> Changed as per suggestion.
>>
>>> There's no requirement that every session have every tranche
>>> registered. I think we should consider displaying "extension" for any
>>> tranche that's not built-in, or at least for tranches that are not
>>> registered (rather than "unknown wait event").
>>>
>>> + if (lock->tranche == 0 && lockId < NUM_INDIVIDUAL_LWLOCKS)
>>>
>>> Isn't the second part automatically true at this point?
>>>
>>
>> Fixed.
>>
>>> The changes to LockBufferForCleanup() don't look right to me. Waiting
>>> for a buffer pin might be a reasonable thing to define as a wait
>>> event, but it shouldn't reported as if we were waiting on the LWLock
>>> itself.
>>>
>>
>> As discussed upthread, added a new wait event BufferPin for this case.
>>
>>> What happens if an error is thrown while we're in a wait?
>>>
>>
>> As discussed upthread, added in AbortTransaction and from where ever
>> LWLockReleaseAll() gets called, point to note is that we can call this
>> function only when we are sure there is no further possibility of wait on
>> LWLock.
>>
>>> Does this patch hurt performance?
>>>
>>
>> Performance tests are underway.
>
> I've attached a revised version of the patch with the following corrections:
>
> + <para>
> + <literal>LWLockTranche</>: The backend is waiting for any one of a
> + group of locks with similar function. The <literal>wait_event</>
> + name for this type of wait identifies the general purpose of locks
> + in that group.
> + </para>
>
> s/with similar/with a similar/
>
> + <row>
> + <entry><literal>ControlFileLock</></entry>
> + <entry>A server process is waiting to read or update the control file
> + or creation of a new WAL log file.</entry>
> + </row>
>
> As the L in WAL stands for "log" anyway, I think the extra "log" word
> can be removed.
>
> + <row>
> + <entry><literal>RelCacheInitLock</></entry>
> + <entry>A server process is waiting to read or write to relation cache
> + initialization file.</entry>
> + </row>
>
> s/to relation/to the relation/
>
> + <row>
> + <entry><literal>BtreeVacuumLock</></entry>
> + <entry>A server process is waiting to read or update vacuum related
> + information for Btree index.</entry>
> + </row>
>
> s/vacuum related/vacuum-related/
> s/for Btree/for a Btree/
>
> + <row>
> + <entry><literal>AutovacuumLock</></entry>
> + <entry>A server process which could be autovacuum worker is waiting to
> + update or read current state of autovacuum workers.</entry>
> + </row>
>
> s/could be autovacuum/could be that an autovacuum/
> s/read current/read the current/
>
> (discussed with Amit offline about other sources of wait, and he
> suggested autovacuum launcher, so I've added that in too)
>
> + <row>
> + <entry><literal>AutovacuumScheduleLock</></entry>
> + <entry>A server process is waiting on another process to ensure that
> + the table it has selected for vacuum still needs vacuum.
> + </entry>
> + </row>
>
> s/for vacuum/for a vacuum/
> s/still needs vacuum/still needs vacuuming/
>
> + <row>
> + <entry><literal>SyncScanLock</></entry>
> + <entry>A server process is waiting to get the start location of scan
> + on table for synchronized scans.</entry>
> + </row>
>
> s/of scan/of a scan/
> s/on table/on a table/
>
> + <row>
> + <entry><literal>SerializableFinishedListLock</></entry>
> + <entry>A server process is waiting to access list of finished
> + serializable transactions.</entry>
> + </row>
>
> s/to access list/to access the list/
>
> + <row>
> + <entry><literal>SerializablePredicateLockListLock</></entry>
> + <entry>A server process is waiting to perform operation on list of
> + locks held by serializable transactions.</entry>
> + </row>
>
> s/perform operation/perform an operation/
> s/on list/on a list/
>
> + <row>
> + <entry><literal>AutoFileLock</></entry>
> + <entry>A server process is waiting to update
> <filename>postgresql.auto.conf</>
> + file.</entry>
> + </row>
>
> s/to update/to update the/
>
> + <row>
> + <entry><literal>CommitTsLock</></entry>
> + <entry>A server process is waiting to read or update the last value
> + set for transaction timestamp.</entry>
> + </row>
>
> s/for transaction/for the transaction/
>
> + <row>
> + <entry><literal>clog</></entry>
> + <entry>A server process is waiting on any one of the clog buffer locks
> + to read or write the clog page in pg_clog subdirectory.</entry>
> + </row>
>
> s/page in/page in the/
>
> The 6 rows that follow that one could apply this same correction.
>
> + <row>
> + <entry><literal>wal_insert</></entry>
> + <entry>A server process is waiting on any one of the wal_insert locks
> + to write data in wal buffers.</entry>
> + </row>
>
> s/data in/data into/
>
> + <row>
> + <entry><literal>buffer_content</></entry>
> + <entry>A server process is waiting on any one of the buffer_content
> + locks to read or write data in shared buffers.</entry>
> + </row>
>
> s/data in/data into/
>
> + <row>
> + <entry><literal>buffer_io</></entry>
> + <entry>A server process is waiting on any one of the buffer_io locks
> + to allow other process to complete the I/O on buffer.</entry>
> + </row>
>
> s/allow other process/allow another process/
>
> + <row>
> + <entry><literal>buffer_mapping</></entry>
> + <entry>A server process is waiting on any one of the buffer_mapping
> + locks to associate a data block with buffer in buffer pool.</entry>
> + </row>
>
> s/with buffer/with a buffer/
> s/in buffer pool/in the buffer pool/
>
> + <row>
> + <entry><literal>lock_manager</></entry>
> + <entry>A server process is waiting on any one of the
> lock_manager locks
> + to add or examine locks for backends or waiting on joining/exiting
> + parallel group to perform parallel query.</entry>
> + </row>
>
> s/exiting parallel/exiting a parallel/
> s/perform parallel/perform a parallel/
>
> + <row>
> + <entry><literal>relation</></entry>
> + <entry>A server process is waiting to acquire lock on a
> relation.</entry>
> + </row>
>
> s/acquire lock/acquire a lock/
>
> + <row>
> + <entry><literal>tuple</></entry>
> + <entry>A server process is waiting to acquire a lock on tuple.</entry>
> + </row>
>
> s/on tuple/on a tuple/
>
> + <row>
> + <entry><literal>virtualxid</></entry>
> + <entry>A server process is waiting to acquire virtual xid
> lock.</entry>
> + </row>
>
> s/acquire virtual/acquire a virtual/
>
> The 5 rows that follow the above one can apply a similar substitution
> (i.e. s/acquire/acquire a/ )
>
> + <para>
> + For tranches registered by extensions, the name is specified by extension
> + and the same will be displayed as <structfield>wait_event</>. It is quite
> + possible that user has registered tranche in one of the backends (by
> + having allocation in dynamic shared memory) in which case other backends
> + won't have that information, so we display <literal>extension</> for such
> + cases.
> + </para>
>
> s/and the same will/and this will/
> s/has registered tranche/has registered the tranche/

Reading one of the corrections back, I wasn't happy with it, so I've changed:

"A server process which could be that an autovacuum worker or
autovacuum launcher is waiting to update or read the current state of
autovacuum workers."

to

"A server process which could be an autovacuum worker or autovacuum
launcher waiting to update or read the current state of autovacuum
workers."

Thom

Attachment Content-Type Size
extend_pg_stat_activity_v12b.patch binary/octet-stream 56.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2016-03-04 13:53:37 Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Previous Message Alexander Korotkov 2016-03-04 13:41:36 Re: RFC: replace pg_stat_activity.waiting with something more descriptive