Re: Tracking wait event for latches

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking wait event for latches
Date: 2016-09-28 00:39:29
Message-ID: CAEepm=1i=BoHTWvSS5swRV_SEi-DkwvvJo46-aN9dSvW6DMA2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 26, 2016 at 7:07 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Mon, Sep 26, 2016 at 1:46 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> If the class really is strictly implied by the WaitEventIdentifier,
>> then do we really need to supply it everywhere when calling the
>> various wait functions? That's going to be quite a few functions:
>> WaitLatch, WaitLatchOrSocket, WaitEventSetWait for now, and if some
>> more patches out there have legs then also ConditionVariableWait,
>> BarrierWait, and possibly further kinds of wait points. And then all
>> their call sites will have have to supply the wait class ID, even
>> though it is implied by the other ID. Perhaps that array that
>> currently holds the names should instead hold { classId, displayName }
>> so we could just look it up?
>
> I considered this reverse-engineering, but arrived at the conclusion
> that having a flexible API mattered more to give more flexibility to
> module developers. In short this avoids having extra class IDs like
> ActivityExtention, TimeoutExtension, etc. But perhaps that's just a
> matter of taste..

Ok, if they really are independent then shouldn't we take advantage of
that at call sites where we might be idle but we might also be waiting
for the network? For example we could do this:

/* Sleep until something happens or we time out */
WaitLatchOrSocket(MyLatch, wakeEvents,
MyProcPort->sock, sleeptime,
pq_is_send_pending() ? WAIT_CLIENT :
WAIT_ACTIVITY,
WE_WAL_SENDER_MAIN);

Isn't WE_WAL_SENDER_WAIT_WAL primarily WAIT_IPC, not WAIT_CLIENT? Or
perhaps "pq_is_send_pending() ? WAIT_CLIENT : WAIT_IPC".

Actually, I'm still not sold on "Activity" and "Client". I think
"Idle" and "Network" would be better. Everybody knows intuitively
what "Idle" means. "Network" is better than "Client" because it
avoids confusion about user applications vs replication connections or
clients vs servers.

+ <listitem>
+ <para>
+ <literal>Activity</>: The server process is waiting for some
+ activity to happen on a socket. This is mainly used system processes
+ in their main processing loop. <literal>wait_event</> will identify
+ the type of activity waited for.
+ </para>
+ </listitem>

"The server process is waiting for some activity to happen on a
socket." Not true: could be a latch, or both.

I think what this category is really getting at is that the server
process is idle. Everything in it could otherwise go in the IPC or
Client categories on the basis that it's mainly waiting for a socket
or a latch, but we're deciding to separate the wait points
representing "idleness" and put them here.

How about: "The server process is idle. This is used by system
processes waiting for activity in their main processing loop.
<literal>wait_event</a> will identify the specific wait point."

+ <listitem>
+ <para>
+ <literal>Client</>: The server process is waiting for some activity
+ on a socket from a client process, and that the server expects
+ something to happen that is independent from its internal processes.
+ <literal>wait_event</> will identify the type of client activity
+ waited for.
+ </para>
+ </listitem>

Is it worth spelling out that "client process" here doesn't just mean
user applications, it's also remote PostgreSQL servers doing
replication? "wait_event" doesn't really identify the type of client
activity waited for, it identifies the code that is waiting.

+ <listitem>
+ <para>
+ <literal>Extension</>: The server process is waiting for activity
+ in a plugin or an extension. This category is useful for plugin
+ and module developers to track custom waiting points.
+ </para>
+ </listitem>

Plugin, extension, module? How about just "extension"? Why developers?

+ <listitem>
+ <para>
+ <literal>IPC</>: The server process is waiting for some activity
+ from an auxilliary process. <literal>wait_event</> will identify
+ the type of activity waited for.
+ </para>
+ </listitem>

s/auxilliary/auxiliary/, but I wouldn't it be better to say something
more general like "from another process in the cluster"? Background
workers are not generally called auxiliary processes, and some of
these wait points are waiting for those.

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-09-28 01:04:14 Re: Transactions involving multiple postgres foreign servers
Previous Message Mark Dilger 2016-09-28 00:32:14 Re: gratuitous casting away const