Re: Add session statistics to pg_stat_database

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, ikedamsh(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add session statistics to pg_stat_database
Date: 2020-10-14 09:28:36
Message-ID: f9553a0de5f13b89e8e0c3823a5ff17d56fa8a82.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the --- as always --- valuable review!

On Tue, 2020-10-13 at 17:55 -0500, Justin Pryzby wrote:
> On Tue, Oct 13, 2020 at 01:44:41PM +0200, Laurenz Albe wrote:
> > Attached is v3 with improvements.
>
> + <para>
> + Time spent in database sessions in this database, in milliseconds.
> + </para></entry>
>
> Should say "Total time spent *by* DB sessions..." ?

That is indeed better. Fixed.

> I think these counters are only accurate as of the last state change, right?
> So a session which has been idle for 1hr, that 1hr is not included. I think
> the documentation should explain that, or (ideally) the implementation would be
> more precise. Maybe the timestamps should only be updated after a session
> terminates (and the docs should say so).

I agree, and I have added an explanation that the value doesn't include
the duration of the current state.

Of course it would be nice to have totally accurate values, but I think
that the statistics are by nature inaccurate (datagrams can get lost),
and more frequent statistics updates increase the work load.
I don't think that is worth the effort.

> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>connections</structfield> <type>bigint</type>
> + </para>
> + <para>
> + Number of connections established to this database.
>
> *Total* number of connections established, otherwise it sounds like it might
> mean "the number of sessions [currently] established".

Fixed like that.

> + Number of database sessions to this database that did not end
> + with a regular client disconnection.
>
> Does that mean "sessions which ended irregularly" ? Or does it also include
> "sessions which have not ended" ?

I have added an explanation for that.

> + msg.m_aborted = (!disconnect || pgStatSessionDisconnected) ? 0 : 1;
>
> I think this can be just:
> msg.m_aborted = (bool) (disconnect && !pgStatSessionDisconnected);

I mulled over this and finally decided to leave it as it is.

Since "m_aborted" gets added to the total counter, I'd prefer to
have it be an "int".

Your proposed code works (the cast is actually not necessary, right?).
But I think that my version is more readable if you think of
"m_aborted" as a counter rather than a flag.

> + if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
> + result = 0;
> + else
> + result = ((double) dbentry->n_session_time) / 1000.0;
>
> I think these can say:
> > double result = 0;
> > if ((dbentry=..) != NULL)
> > result = (double) ..;
>
> That not only uses fewer LOC, but also the assignment to zero is (known to be)
> done at compile time (BSS) rather than runtime.

I didn't know about the performance difference.
Concise code (if readable) is good, so I changed the code like you propose.

The code pattern is actually copied from neighboring functions,
which then should also be changed like this, but that is outside
the scope of this patch.

Attached is v4 of the patch.

Yours,
Laurenz Albe

Attachment Content-Type Size
0001-Add-session-statistics-to-pg_stat_database.v4.patch text/x-patch 17.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2020-10-14 09:43:33 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Previous Message Bharath Rupireddy 2020-10-14 09:16:11 Re: Parallel Inserts in CREATE TABLE AS