RFC: replace pg_stat_activity.waiting with something more descriptive

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RFC: replace pg_stat_activity.waiting with something more descriptive
Date: 2015-06-22 17:37:41
Message-ID: CA+TgmoYd3GTz2_mJfUHF+RPe-bCy75ytJeKVv9x-o+SonCGApw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

When a PostgreSQL system wedges, or when it becomes dreadfully slow
for some reason, I often find myself relying on tools like strace,
gdb, or perf to figure out what is happening. This doesn't tend to
instill customers with confidence; they would like (quite
understandably) a process that doesn't require installing developer
tools on their production systems, and doesn't require a developer to
interpret the results, and perhaps even something that they could
connect up to PEM or Nagios or whatever alerting system they are

There are obviously many ways that we might think about improving
things here, but what I'd like to do is try to put some better
information in pg_stat_activity, so that when a process is not
running, users can get some better information about *why* it's not
running. The basic idea is that pg_stat_activity.waiting would be
replaced by a new column pg_stat_activity.wait_event, which would
display the reason why that backend is waiting. This wouldn't be a
free-form text field, because that would be too expensive to populate.
Instead it would contain a "reason code" which would be chosen from a
list of reason codes and translated to text for display. Internally,
pgstat_report_waiting() would be changed to take an integer argument
rather than a Boolean (possibly uint8 would be enough, certainly
uint16 would be), and called from more places. It would continue to
use an ordinary store into shared memory, with no atomic ops or

Currently, the only time we report a process as waiting is when it is
waiting for a heavyweight lock. I'd like to make that somewhat more
fine-grained, by reporting the type of heavyweight lock it's awaiting
(relation, relation extension, transaction, etc.). Also, I'd like to
report when we're waiting for a lwlock, and report either the specific
fixed lwlock for which we are waiting, or else the type of lock (lock
manager lock, buffer content lock, etc.) for locks of which there is
more than one. I'm less sure about this next part, but I think we
might also want to report ourselves as waiting when we are doing an OS
read or an OS write, because it's pretty common for people to think
that a PostgreSQL bug is to blame when in fact it's the operating
system that isn't servicing our I/O requests very quickly. We could
also invent codes for things like "I'm doing a pg_usleep because I've
exceeded max_spins_per_delay" and "I'm waiting for a cleanup lock on a
buffer" and maybe a few others.

I realize that in many cases these states will be quite transient and
you won't see them in pg_stat_activity for very long before they
vanish; whether you can catch them at all is quite uncertain. It's
not my goal here to create some kind of a performance counter system,
even though that would be valuable and could possibly be based on the
same infrastructure, but rather just to create a very simple system
that lets people know, without any developer tools, what is causing a
backend that has accepted a query and not yet returned a result to be
off-CPU rather than on-CPU. In the cases where there are many
backends, you may be able to see non-NULL results often enough to get
a sense of where the problem is; or in the case where there's one
backend that is persistently stuck, you will hopefully be able to tell
where it's stuck.


Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2015-06-22 17:44:27 Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Previous Message Prakash Itnal 2015-06-22 17:35:38 Re: Auto-vacuum is not running in 9.1.12