Re: Add session statistics to pg_stat_database

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add session statistics to pg_stat_database
Date: 2020-11-20 14:41:05
Message-ID: 2099659e5d679d070f0bb2edb573f296f3c9813f.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2020-11-17 at 17:33 +0100, Magnus Hagander wrote:
> I've taken a look as well, and here are a few short notes:

Much appreciated!

> * It talks about "number of connections" but "number of aborted sessions". We should probably
> be consistent about talking either about connections or sessions? In particular, connections
> seems wrong in this case, because it only starts counting after authentication is complete
> (since otherwise we send no stats)? (This goes for both docs and actual function names)

Yes, that is true. I have changed "connections" to "sessions" and renamed the new
column "connections" to "session_count".

I think that most people will understand a session as started after a successful
connection.

> * Is there a reason we're counting active and idle in transaction (including aborted),
> but not fastpath? In particular, we seem to ignore fastpath -- if we don't want to single
> it out specifically, it should probably be included in active?

The only reason is that I didn't think of it. Fixed.

> * pgstat_send_connstat() but pgstat_recv_connection(). Let's call both connstat or both
> connection (I'd vote connstat)?

Agreed, done.

> * Is this actually a fix that's independent of the new stats? It seems in general to be
> changing the behaviour of "force", which is more generic?
> - !have_function_stats)
> + !have_function_stats && !force)

The comment right above that reads:
/* Don't expend a clock check if nothing to do */
So it is just a quick exit if there is nothing to do.

But with that patch we have something to do if "force" (see below) is true:
Report the remaining session duration and if the session was closed normally.

Thus the additional check.

> * in pgstat_send_connstat() you pass the parameter "force" in as "disconnect".
> That behaviour at least requires a comment saying why, I think. My understanding is
> it relies on that "force" means this is a "backend is shutting down", but that is not
> actually documented anywhere. Maybe the "force" parameter should actually be renamed
> to indicate this is really what it means, to avoid a future mistake in the area?
> But even with that, how does that turn into disconnect?

"pgstat_report_stat(true)" is only called from "pgstat_beshutdown_hook()", so
it is currently only called when the backend is about to exit.

According the the comments the flag means that "caller wants to force stats out".
I guess that the author thought that there may arise other reasons to force sending
statistics in the future (commit 641912b4d from 2007).

However, since that has not happened, I have renamed the flag to "disconnect" and
adapted the documentation. This doesn't change the current behavior, but establishes
a new rule.

> * Maybe rename pgStatSessionDisconnected to pgStatSessionNormalDisconnected?
> To avoid having to go back to the setting point and look it up in a comment.

Long, descriptive names are a good thing.
I have decided to use "pgStatSessionDisconnectedNormally", since that is even longer
and seems to fit the "yes or no" category better.

> I wonder if there would also be a way to count "sessions that crashed" as well.
> That is,the ones that failed in a way that caused the postmaster to restart the system.
> But that's information we'd have to send from the postmaster, but I'm actually unsure
> if we're "allowed" to send things to the stats collector from the postmaster.
> But I think it could be quite useful information to have. Maybe we can find some way
> to piggyback on the fact that we're restarting the stats collector as a result?

Sure, a crash count would be useful. I don't know if it is easy for the stats collector
to tell the difference between a start after a backend crash and - say - starting from
a base backup.

Patch v6 attached.

I think that that would be material for another patch, and I don't think it should go
to "pg_stat_database", because a) it might be hard to tell to which database the crashed
backend was attached, b) it might be a background process that doesn't belong to a database
and c) if the crash were caused by - say - corruption in a shared catalog, it would be
misleading.

Yours,
Laurenz Albe

Attachment Content-Type Size
0001-Add-session-statistics-to-pg_stat_database.v6.patch text/x-patch 18.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2020-11-20 14:47:33 Re: VACUUM (DISABLE_PAGE_SKIPPING on)
Previous Message Dean Rasheed 2020-11-20 14:30:25 Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE