Re: Our naming of wait events is a disaster.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Our naming of wait events is a disaster.
Date: 2020-05-14 19:37:07
Message-ID: CA+TgmoaLEdDHpTiKtNkjD23WH3zYnj+hLiW4ZC0tZuLpSf7xeQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 14, 2020 at 2:54 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, we could solve this problem very easily by ripping out everything
> having to do with wait-state monitoring ... and personally I'd be a lot
> in favor of that, because I haven't seen anything about either the
> design or documentation of the feature that I thought was very well
> done.

Well, I'm going to disagree with that, but opinions can vary. If I'd
tried to create naming consistency here when I created this stuff, I
would've had to rename things in existing systems rather than just
expose what was already there, and that wasn't the goal of the patch,
and I don't see a very good reason why it should have been. Providing
information is a separate project from cleaning up naming. And, while
I don't love the fact that people have added new things without trying
very hard to be consistent with existing things, I still don't think
inconsistent naming rises to the level of a disaster.

> However, if you'd like to have wait-state monitoring, and you'd
> like the documentation for it to be more useful than "go read the code",
> then I don't see any way around the conclusion that there are going to
> be centralized lists of the possible wait states.
>
> That being the case, refusing to use a centralized list in the
> implementation seems rather pointless; and having some aspects of the
> implementation use centralized lists (see the enums in lwlock.h and
> elsewhere) while other aspects don't is just schizophrenic.

There's something to that argument, especially if it enables us to
auto-generate the documentation tables.
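
To make the centralized-list idea concrete, here is a minimal sketch of
one way a single list could feed both the code and the documentation
tables, assuming an X-macro approach. Everything in it (the
WAIT_EVENT_LIST macro, the WaitEventSketch type, the helper functions,
and the description strings) is hypothetical and for illustration only;
it is not how the wait-event code is actually written. Only the event
names ClientRead and DataFileWrite come from this discussion.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical central list: one X(symbol, description) entry per wait
 * event. The descriptions are illustrative placeholders.
 */
#define WAIT_EVENT_LIST(X) \
    X(ClientRead, "Waiting to read data from the client.") \
    X(DataFileWrite, "Waiting for a write to a relation data file.")

/* Expand the list into an enum used by the implementation ... */
#define AS_ENUM(sym, desc) WAIT_EVENT_##sym,
typedef enum
{
    WAIT_EVENT_LIST(AS_ENUM)
    WAIT_EVENT_COUNT            /* number of entries in the list */
} WaitEventSketch;

/* ... and into parallel tables of display names and descriptions. */
#define AS_NAME(sym, desc) #sym,
#define AS_DESC(sym, desc) desc,
const char *const wait_event_names[] = { WAIT_EVENT_LIST(AS_NAME) };
const char *const wait_event_descs[] = { WAIT_EVENT_LIST(AS_DESC) };

/* Map an enum value to its display name. */
const char *
wait_event_name(WaitEventSketch ev)
{
    assert((int) ev >= 0 && (int) ev < WAIT_EVENT_COUNT);
    return wait_event_names[ev];
}

/* Emit a plain-text documentation table generated from the same list. */
void
print_doc_table(FILE *out)
{
    for (int i = 0; i < WAIT_EVENT_COUNT; i++)
        fprintf(out, "%-16s %s\n", wait_event_names[i], wait_event_descs[i]);
}
```

With a layout along these lines, adding an event means adding one line
to the list, and the enum, the name lookup, and the generated table all
stay in sync.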

That being said, my view of this system is that it's good to document
the wait events that we have, but also that there are almost certainly
going to be cases where we can't say a whole lot more than "go read
the code," or at least not without an awful lot of work. I think
there's a reasonable chance that someone who sees a lot of ClientRead
or DataFileWrite wait events will have some idea what kind of problem
is indicated, even without consulting the documentation, and even
more so if we have some good documentation which they can consult. But
I don't know what anybody's going to do if they see a lot of
OldSerXidLock or AddinShmemInitLock contention. Presumably those are
cases that the developers thought were unlikely, or they'd have chosen
a different locking regimen. If they were wrong, I think it's a good
thing for users to have a relatively easy way to find that out, but
I'm not sure what anybody's going to be able to do about it without
patching the code, or at least looking at it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
