Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date: 2015-07-06 14:48:41
Message-ID: CAHGQGwFjQ3pmv8Yeknxtz4G=ntZRqP6NHwrqkcSGpRQufaboJA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 26, 2015 at 12:39 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jun 25, 2015 at 9:23 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On 6/22/15 1:37 PM, Robert Haas wrote:
>>> Currently, the only time we report a process as waiting is when it is
>>> waiting for a heavyweight lock. I'd like to make that somewhat more
>>> fine-grained, by reporting the type of heavyweight lock it's awaiting
>>> (relation, relation extension, transaction, etc.). Also, I'd like to
>>> report when we're waiting for a lwlock, and report either the specific
>>> fixed lwlock for which we are waiting, or else the type of lock (lock
>>> manager lock, buffer content lock, etc.) for locks of which there is
>>> more than one. I'm less sure about this next part, but I think we
>>> might also want to report ourselves as waiting when we are doing an OS
>>> read or an OS write, because it's pretty common for people to think
>>> that a PostgreSQL bug is to blame when in fact it's the operating
>>> system that isn't servicing our I/O requests very quickly.
>>
>> Could that also cover waiting on network?
>
> Possibly. My approach requires that the number of wait states be kept
> relatively small, ideally fitting in a single byte. And it also
> requires that we insert pgstat_report_waiting() calls around the thing
> that is notionally blocking. So, if there are a small number of
> places in the code where we do network I/O, we could stick those calls
> around those places, and this would work just fine. But if a foreign
> data wrapper, or any other piece of code, does network I/O - or any
> other blocking operation - without calling pgstat_report_waiting(), we
> just won't know about it.
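
To make the approach quoted above concrete, here is a minimal sketch of wrapping a blocking call with a pair of report calls. Everything in it is an assumption for illustration: `WaitState`, `report_waiting()`, and `read_with_wait_report()` are hypothetical names, and `report_waiting()` is only a stand-in, not the actual signature of pgstat_report_waiting().

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical wait states, kept few enough to fit in a single byte. */
typedef enum
{
    WAIT_NONE = 0,
    WAIT_NETWORK_READ = 1
} WaitState;

/* Stand-in for the per-backend status slot: one byte, as in the proposal. */
static volatile uint8_t my_wait_state = WAIT_NONE;

/* Hypothetical stand-in for pgstat_report_waiting(); not the real signature. */
static void
report_waiting(WaitState state)
{
    assert((int) state <= UINT8_MAX);   /* must fit in one byte */
    my_wait_state = (uint8_t) state;
}

/* Bracket the notionally blocking call with the two report calls. */
static ssize_t
read_with_wait_report(int fd, void *buf, size_t len)
{
    ssize_t     n;

    report_waiting(WAIT_NETWORK_READ);
    n = read(fd, buf, len);     /* may block here */
    report_waiting(WAIT_NONE);
    return n;
}
```

Any code path that blocks without going through such a wrapper stays invisible, which is the limitation described in the quoted text.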

Itagaki-san's very similar proposal and patch would probably be useful
when considering which wait events to track.
http://www.postgresql.org/message-id/20090309125146.913C.52131E4D@oss.ntt.co.jp

According to his patch, the wait events that he was thinking of adding were:

+ typedef enum PgCondition
+ {
+ PGCOND_UNUSED = 0, /* unused */
+
+ /* 10000 - CPU */
+ PGCOND_CPU = 10000, /* generic cpu operations */
+ /* 11000 - CPU:PARSE */
+ PGCOND_CPU_PARSE = 11000, /* pg_parse_query */
+ PGCOND_CPU_PARSE_ANALYZE = 11100, /* parse_analyze */
+ /* 12000 - CPU:REWRITE */
+ PGCOND_CPU_REWRITE = 12000, /* pg_rewrite_query */
+ /* 13000 - CPU:PLAN */
+ PGCOND_CPU_PLAN = 13000, /* pg_plan_query */
+ /* 14000 - CPU:EXECUTE */
+ PGCOND_CPU_EXECUTE = 14000, /* PortalRun or PortalRunMulti */
+ PGCOND_CPU_TRIGGER = 14100, /* ExecCallTriggerFunc */
+ PGCOND_CPU_SORT = 14200, /* (generic sort operation) */
+ PGCOND_CPU_SORT_HEAP = 14210, /* tuplesort_begin_heap */
+ PGCOND_CPU_SORT_INDEX = 14220, /* tuplesort_begin_index_btree */
+ PGCOND_CPU_SORT_DATUM = 14230, /* tuplesort_begin_datum */
+ /* 15000 - CPU:UTILITY */
+ PGCOND_CPU_UTILITY = 15000, /* ProcessUtility */
+ PGCOND_CPU_COMMIT = 15100, /* CommitTransaction */
+ PGCOND_CPU_ROLLBACK = 15200, /* AbortTransaction */
+ /* 16000 - CPU:TEXT */
+ PGCOND_CPU_TEXT = 16000, /* (generic text operation) */
+ PGCOND_CPU_DECODE = 16100, /* pg_client_to_server */
+ PGCOND_CPU_ENCODE = 16200, /* pg_server_to_client */
+ PGCOND_CPU_LIKE = 16310, /* GenericMatchText */
+ PGCOND_CPU_ILIKE = 16320, /* Generic_Text_IC_like */
+ PGCOND_CPU_RE = 16400, /* (generic regexp operation) */
+ PGCOND_CPU_RE_COMPILE = 16410, /* RE_compile_and_cache */
+ PGCOND_CPU_RE_EXECUTE = 16420, /* RE_execute */
+
+ /* 20000 - NETWORK */
+ PGCOND_NETWORK = 20000, /* (generic network operation) */
+ PGCOND_NETWORK_RECV = 21000, /* secure_read */
+ PGCOND_NETWORK_SEND = 22000, /* secure_write */
+
+ /* 30000 - IDLE (should be larger than network to distinguish idle or recv) */
+ PGCOND_IDLE = 30000, /* <IDLE> */
+ PGCOND_IDLE_IN_TRANSACTION = 31000, /* <IDLE> in transaction */
+ PGCOND_IDLE_SLEEP = 32000, /* pg_usleep */
+
+ /* 40000 - XLOG */
+ PGCOND_XLOG = 40000, /* (generic xlog operation) */
+ PGCOND_XLOG_CRC = 41000, /* crc calculation in XLogInsert */
+ PGCOND_XLOG_INSERT = 42000, /* insert in XLogInsert */
+ PGCOND_XLOG_OPEN = 43000, /* XLogFileOpen */
+ PGCOND_XLOG_CLOSE = 44000, /* XLogFileClose */
+ PGCOND_XLOG_WRITE = 45000, /* write in XLogWrite */
+ PGCOND_XLOG_FLUSH = 46000, /* issue_xlog_fsync */
+
+ /* 50000 - DATA */
+ PGCOND_DATA = 50000, /* (generic data operation) */
+ PGCOND_DATA_CREATE = 51000, /* smgrcreate */
+ PGCOND_DATA_OPEN = 52000, /* smgropen */
+ PGCOND_DATA_CLOSE = 53000, /* smgrclose */
+ PGCOND_DATA_STAT = 54000, /* smgrnblocks */
+ PGCOND_DATA_READ = 55000, /* smgrread */
+ PGCOND_DATA_PREFETCH = 56000, /* smgrprefetch */
+ PGCOND_DATA_WRITE = 57000, /* smgrwrite */
+ PGCOND_DATA_EXTEND = 58000, /* smgrextend */
+
+ /* 60000 - TEMP */
+ PGCOND_TEMP = 60000, /* (generic temp file operation) */
+ PGCOND_TEMP_READ = 61000, /* BufFileRead */
+ PGCOND_TEMP_WRITE = 62000, /* BufFileWrite */
+
+ /* 70000 - LOCK */
+ PGCOND_LOCK = 70000, /* waiting on a lmgr lock */
+ /* 70001-70999 is reserved for lmgr locks */
+
+ /* 80000 - LWLOCK */
+ PGCOND_LWLOCK = 80000, /* waiting on a generic lwlock */
+ /* 80001-80999 is reserved for named lwlocks */
+ PGCOND_LWLOCK_BUFMAPPING = 81000, /* BufMappingLock(s) */
+ PGCOND_LWLOCK_LOCKMGR = 82000, /* LockMgrLock(s) */
+ PGCOND_LWLOCK_PAGE = 83000, /* BufferDesc.content_lock */
+ PGCOND_LWLOCK_IO = 84000, /* BufferDesc.io_in_progress_lock */
+
+ /* 90000 - SPINLOCK */
+ PGCOND_SPINLOCK = 90000 /* timeout in s_lock */
+ } PgCondition;
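
The numbering in that enum looks hierarchical: the ten-thousands digit selects the top-level class (CPU, NETWORK, ...) and the lower digits select sub-groups such as CPU:EXECUTE. Assuming that reading is right, a consumer could group events with simple arithmetic. The helpers below are hypothetical and are not part of his patch; they are only a sketch of the idea.

```c
#include <assert.h>

/* Top-level classes, taken from the "N0000 - NAME" comments in the enum. */
static const char *const pgcond_class_names[] = {
    "UNUSED", "CPU", "NETWORK", "IDLE", "XLOG",
    "DATA", "TEMP", "LOCK", "LWLOCK", "SPINLOCK"
};

/* Hypothetical helper: index of the top-level class of a condition code. */
static int
pgcond_class(int cond)
{
    assert(cond >= 0 && cond < 100000);
    return cond / 10000;
}

/* Hypothetical helper: name of the top-level class of a condition code. */
static const char *
pgcond_class_name(int cond)
{
    return pgcond_class_names[pgcond_class(cond)];
}

/*
 * Hypothetical helper: does cond fall in the range [group, group + step)?
 * Use step = 10000 for a top-level class, 1000 for a second-level group.
 */
static int
pgcond_in_group(int cond, int group, int step)
{
    return cond >= group && cond < group + step;
}
```

For example, PGCOND_CPU_SORT_HEAP (14210) would land in class "CPU" and inside the CPU:EXECUTE group that starts at 14000, while PGCOND_LWLOCK_IO (84000) would land in class "LWLOCK".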

Regards,

--
Fujii Masao
