Re: Tracking wait event for latches

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking wait event for latches
Date: 2016-09-21 03:40:41
Message-ID: CAB7nPqQc2S=-GQRtk2rZPVrwQ6uzmUzY6_uE5Zxq4dNTp-3Brg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Wed, Sep 21, 2016 at 8:13 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> + <para>
> + <literal>EventSet</>: The server process is waiting on a socket
> + or a timer. This wait happens in a latch, an inter-process facility
> + using boolean variables letting a process sleep until it is set.
> + Latches support several type of operations, like postmaster death
> + handling, timeout and socket activity lookup, and are a useful
> + replacement for <function>sleep</> where signal handling matters.
> + </para>
>
> This paragraph seems a bit confused. I suggest something more like
> this: "The server process is waiting for one or more sockets, a timer
> or an interprocess latch. The wait happens in a WaitEventSet,
> <productname>PostgreSQL</>'s portable IO multiplexing abstraction."

OK, I have tweaked the paragraph as follows using your suggestion:
+ <listitem>
+ <para>
+ <literal>EventSet</>: The server process is waiting on one or more
+ sockets, a time or an inter-process latch. The wait happens in a
+ <function>WaitEventSet</>, <productname>PostgreSQL</>'s portable
+ I/O multiplexing abstraction using boolean variables letting a
+ process sleep until it is set. It supports several type of
+ operations, like postmaster death handling, timeout and socket
+ activity lookup, and are a useful replacement for <function>sleep</>
+ where signal handling matters.
+ </para>
+ </listitem>

> On Tue, Aug 23, 2016 at 7:01 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>> On Mon, Aug 22, 2016 at 12:09 PM, Alexander Korotkov
>>> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>>>> Would it be better to use an array here?
>>
>> Okay, I have switched to an array....
>
> I looked at various macro tricks[1] but they're all pretty unpleasant!
> +1 for the simple array with carefully managed order. About that
> order...
> [1] http://stackoverflow.com/questions/147267/easy-way-to-use-variables-of-enum-types-as-string-in-c

Yes, I recall bumping on this one, or something really similar to that...

> +const char *const WaitEventNames[] = {
> [...]
> +};
>
> It looks like this array wants to be in alphabetical order, but it
> isn't quite. Also, perhaps a compile time assertion about the size of
> the array matching EVENT_LAST_TYPE could be useful?

In GetWaitEventIdentifier()? I'd think that just returning ??? would
have been fine if there is a non-matching call.

> +typedef enum WaitEventIdentifier
> +{
> [...]
> +} WaitEventIdentifier;
>
> This is also nearly but not exactly in alphabetical order
> (EVENT_PROC_SIGNAL comes after EVENT_PROC_SLEEP for example). But
> it's not currently possible to have the strings and the enumerators
> both in alphabetical order because they're not the same, even
> accounting for camel-case to upper-case transliteration. I think at
> least one of them should be sorted. Shouldn't they match fully and
> then *both* be sorted alphabetically? For example
> "MessageQueueInternal" doesn't match EVENT_MQ_WAIT_INTERNAL. Then
> there are some cases where I'd expect underscores for consistency with
> other enumerators and with the corresponding camel-case strings: you
> have EVENT_WAL_SENDER_MAIN, but EVENT_WALWRITER_MAIN.

Not wrong..

> The documentation is in a slightly different order again but also not
> exactly alphabetical: for example ProcSleep is listed before
> ProcSignal.

Right.

> Sorry if this all sounds terribly picky but I think we should try to
> be strictly systematic here.

No worries about that, it matters a lot for this patch. The user-faced
documentation is what should do the decision-making I think. So let's
order the names, and adapt the enum depending on that. I have done so
after double-checking both lists, and added a comment for anybody
updating that in the fiture.

>>>> EventIdentifier seems too general name for me, isn't it? Could we name it
>>>> WaitEventIdentifier? Or WaitEventId for shortcut?
>>
>> ... And WaitEventIdentifier.
>
> +1 from me too for avoiding the overly general term 'event'. It does
> seem a little odd to leave the enumerators names as EVENT_... though;
> shouldn't these be WAIT_EVENT_... or WE_...? Or perhaps you could
> consider WaitPointIdentifier and WP_SECURE_READ or
> WaitEventPointIdentifier and WEP_SECURE_READ, if you buy my earlier
> argument that what we are really naming here is point in the code
> where we wait, not the events we're waiting for. Contrast with
> LWLocks where we report the lock that you're waiting for, not the
> place in the code where you're waiting for that lock.

Well, WE_ if I need make a choice for something else than EVENT_.

> On the other hand, if we could *accurately* report whether it's a
> "Latch", "Socket" or "Latch|Socket" that we're waiting for, it might
> be cool to do that instead. One way to do that would be to use up
> several class IDs: WAIT_EVENT_LATCH, WAIT_EVENT_LATCH_OR_SOCKET,
> WAIT_EVENT_SOCKET (or perhaps WAIT_EVENT_LATCH | WAIT_EVENT_SOCKET,
> reserving 2 or 3 upper bits from the 8 bit class ID for this). Then
> we could figure out the right class ID by looking at set->latch and
> set->nevents (perhaps an extra boolean would be needed to record
> whether postmaster death is in there so we could deduce whether there
> are any sockets). It would be interesting specifically for the case
> of FDWs where it would be nice to be able to see clearly that it's
> waiting for a remote server ("Socket"). It may also be interesting to
> know if there is a timeout. Postmaster death doesn't seem newsworthy,
> we're nearly always also waiting for that exceptional event so it'd
> just be clutter to report it.

That's actually pretty close to what I mentioned upthread here:
https://www.postgresql.org/message-id/CAB7nPqQx4OEym9cf22CY%3D5eWqqiAMjij6EBCoNReezt9-NvGkw%40mail.gmail.com
In order to support that adding a column wait_event_details with
text[] makes the most sense I guess. Still I think that's another
discussion, this patch does already a lot.

So I have adjusted the patch in many ways, tweaked the order of the
items, and adjusted some of their names as suggested by Thomas.
Updated version attached.
--
Michael

Attachment Content-Type Size
wait-event-set-v5.patch text/plain 29.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2016-09-21 04:03:00 Re: Tracking wait event for latches
Previous Message Steve Singer 2016-09-21 03:35:12 Re: Logical Replication WIP