Re: Tracking wait event for latches

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking wait event for latches
Date: 2016-06-02 03:25:19
Message-ID: CAEepm=2Pn-5iDdNWSAZ-3UipDgDpJu+j8wr1PNo8gnmyb-Y18w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 20, 2016 at 8:14 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> Hi all,
>
> As I mentioned $subject a couple of months back after looking at the
> wait event facility here:
> http://www.postgresql.org/message-id/CAB7nPqTJpgAvOK4qSC96Fpm5W+aCtJ9D=3Vn9AfiEYsur=-juw@mail.gmail.com
> I have actually taken some time to implement this idea.
>
> The particular case that I had in mind was to be able to track in
> pg_stat_activity processes that are waiting on a latch for synchronous
> replication, and here is what this patch gives in this case:
> =# alter system set synchronous_standby_names = 'foo';
> ALTER SYSTEM
> =# select pg_reload_conf();
> pg_reload_conf
> ----------------
> t
> (1 row)
> =# -- Do something
> [...]
>
> And from another session:
> =# select wait_event_type, wait_event from pg_stat_activity where pid = 83316;
> wait_event_type | wait_event
> -----------------+------------
> Latch | SyncRep
> (1 row)
>
> This is a boring patch, and it relies on the wait event facility that
> has been added recently in 9.6. Note a couple of things though:
> 1) There is something like 30 code paths calling WaitLatch in the
> backend code, all those events are classified and documented similarly
> to LWLock events.
> 2) After discussing this stuff while at PGCon, it does not seem worth
> to have any kind of APIs to be able to add in shared memory custom
> latch names that extensions could load through _PG_init(). In
> replacement to that, there is a latch type flag called "Extension"
> that can be used for this purpose.
> Comments are welcome, I am adding that in the 9.7 queue.

This is very cool, and I can't wait to have this feature! It'll be
useful for all kinds of developers and users. I wanted this today to
help debug something I am working on, and remembered this patch. I
have signed up as a reviewer for the September commitfest. But here
is some initial feedback based on a quick glance at it:

This patch allows identifiers to be specified by the WaitLatch and
WaitLatchOrSocket calls, but not for WaitEventSetWait, which is the
more general waiting primitive. I think it should be done by
WaitEventSetWait, and merely passed down by those wrappers functions.

These are actually names for *wait points*, not names for latches.
Some of the language in this patch makes it sound like they are latch
names/latch identifiers, which is inaccurate (the latches themselves
wouldn't be very interesting eg MyLatch). In some cases the main
thing of interest is actually a socket or timer anyway, not a latch,
so maybe it should appear as wait_event_type = WaitEventSet?

Is there a reason you chose names like 'WALWriterMain'? That
particular wait point is in the function WalWriterMain (note different
case). It seems odd to use the containing function names to guide
naming, but not use them exactly. Since this namespace needs to be
able to name wait points anywhere in the postgres source tree (and
maybe outside it too, for extensions), maybe it should be made
hierarchical, like 'walwriter.WalWriterMain' (or some other
organisational scheme).

I personally think it would be very useful for extensions to be able
to name their wait points. For example, I'd rather see
'postgres_fdw.pgfdw_get_result' or similar than a vague 'Extension'
string which appears not only for all wait points in an extension but
also for all extensions. I hope we can figure out a good way to do
that. Clearly that would involve some shmem registry machinery to
make the names visible across backends (a similar problem exists with
lock tranche names).

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-06-02 04:00:19 Re: COMMENT ON, psql and access methods
Previous Message Noah Misch 2016-06-02 01:38:24 Re: [sqlsmith] Failed assertions on parallel worker shutdown