Re: Tracking wait event for latches

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking wait event for latches
Date: 2016-09-23 01:32:28
Message-ID: CA+TgmoaqTpzrNrOLt2_Zgr_4C5RDwr5kg81aQYW-b9UspT9mkw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 22, 2016 at 7:10 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Interesting. OK, I agree that it'd be useful to show that we're
> waiting because there's nothing happening, or waiting because the user
> asked us to sleep, or waiting on IO, or waiting for an IPC response
> because something is happening, and that higher level information is
> difficult/impossible to extract automatically from the WaitEventSet.

Cool. :-)

> I understand that "Activity" is the category of wait points that are
> waiting for activity, but I wonder if it might be clearer to users if
> that were called "Idle", because it's the category of idle waits
> caused by non-activity.

I thought about that but figured it would be better to consistently
state the thing *for which* we were waiting. We wait FOR a client or
a timeout or activity. We do not wait FOR idle; we wait to be NOT
idle.
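
To make that concrete: the idea in the patch is to encode the
category in the high byte of the 32-bit wait event value that gets
reported, along these lines (a sketch of the scheme, not the exact
patch text; names and values may differ):

    /* Wait event classes, one per high byte (sketch). */
    #define PG_WAIT_ACTIVITY   0x05000000U  /* waiting for something to do */
    #define PG_WAIT_CLIENT     0x06000000U  /* waiting on a client socket */
    #define PG_WAIT_IPC        0x08000000U  /* waiting for another process */
    #define PG_WAIT_TIMEOUT    0x09000000U  /* user-requested sleep */

    /* Individual wait points count up from their class value. */
    typedef enum WaitEventActivity
    {
        WAIT_EVENT_BGWRITER_MAIN = PG_WAIT_ACTIVITY,
        WAIT_EVENT_WAL_RECEIVER_MAIN
    } WaitEventActivity;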

> Why is WalSenderMain not in that category alongside WalReceiverMain?
> Hmm, actually it's kind of a tricky one: whether it's really idle or
> waiting for IO depends. It's always ready to wait for clients to send
> messages, but I'd say that's part of its "idle" behaviour. But it's
> sometimes waiting for the socket to be writable: if
> (pq_is_send_pending()) wakeEvents |= WL_SOCKET_WRITABLE, and that's
> when it's definitely not idle, it's actively trying to feed WAL down
> the pipe. Do we want to get into dynamic categories depending on
> conditions like that?

I suspect that's overkill. I don't want wait-point-naming to make
programming the system noticeably more difficult, so I think it's fine
to pick a categorization based on what we think the typical case will be and
call it good. If we try that and people find it's a nuisance, we can
fix it then. In the case of WAL sender, I assume it will normally be
waiting for more WAL to be generated; whereas in the case of WAL
receiver, I assume it will normally be waiting for more WAL to be
received from the remote side. The reverse cases are possible: the
sender could be waiting for the socket buffer to drain so it can push
more WAL onto the wire, and the receiver could likewise be waiting for
buffer space to push out feedback messages. But probably mostly not.
At least for a first cut, I'd be inclined to handle this fuzziness by
putting weasel-words in the documentation rather than by trying to
make the reporting 100% perfectly accurate.
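
For reference, the walsender's wait currently looks roughly like this
(quoting from memory, so the details may be slightly off):

    wakeEvents = WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT |
        WL_SOCKET_READABLE;
    if (pq_is_send_pending())
        wakeEvents |= WL_SOCKET_WRITABLE;
    /* Sleep until something happens or we time out. */
    WaitLatchOrSocket(MyLatch, wakeEvents,
                      MyProcPort->sock, sleeptime);

Under what I'm proposing, that single call site would get one static
tag -- say WalSenderMain, classified under Activity -- even though
the WL_SOCKET_WRITABLE case is arguably a Client wait.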

> I was thinking about suggesting a category "Replication" to cover the
> waits for client IO relating to replication, as opposed to client IO
> waits relating to regular user connections. Then you could put sync
> rep into that category instead of IPC, even though technically it is
> waiting for IPC from walsender process(es), on the basis that it's
> more newsworthy to a DBA that it's really waiting for a remote replica
> to respond. But it's probably pretty clear what's going on from the
> wait point names, so maybe it's not worth a category. Thoughts?

I thought about a replication category but either it will only have
SyncRep in it, which is odd, or it will pull in other things that
otherwise fit nicely into the Activity category, and then the
boundaries of all the categories become mushy: is it the subsystem
that causes the wait that we are trying to document, or the kind of
thing for which we are waiting?
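
To illustrate, sync rep would keep its IPC classification, and the
event name itself carries the replication-specific information
(illustrative, not the actual patch text):

    /* Classified under IPC, because we are waiting for a walsender to
     * set our latch -- but the name still tells the DBA that a remote
     * replica is what we're ultimately waiting on. */
    WAIT_EVENT_SYNC_REP = PG_WAIT_IPC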

> I do suspect that the set of wait points will grow quite a bit as we
> develop more parallel stuff though. For example, I have been working
> on a patch that adds several more wait points, indirectly via
> condition variables (using your patch). Actually in my case it's
> BarrierWait -> ConditionVariableWait -> WaitEventSetWait. I propose
> that these higher level wait primitives should support passing a wait
> point identifier through to WaitEventSetWait.

+1.
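
Presumably that just means giving each of those layers a wait event
argument and relaying it down the stack, something like this
(hypothetical signatures, using your names):

    /* The outermost caller names the wait point; the intermediate
     * primitives just pass it through untouched. */
    extern bool BarrierWait(Barrier *barrier, uint32 wait_event_info);
    extern void ConditionVariableWait(ConditionVariable *cv,
                                      uint32 wait_event_info);
    extern int  WaitEventSetWait(WaitEventSet *set, long timeout,
                                 WaitEvent *occurred_events, int nevents,
                                 uint32 wait_event_info);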

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
