Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, magnus(at)hagander(dot)net, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Date: 2021-09-03 02:51:46
Message-ID: 9672c9b16be3d5e5d6252c30254f6ecaf782868e.camel@cybertec.at
Lists: pgsql-hackers

On Wed, 2021-09-01 at 10:56 +0200, Laurenz Albe wrote:
> On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote:
> > On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
> > > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
> > > > > > > In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > > > > > milliseconds?  That would mean slightly less up-to-date statistics, but
> > > > > > I doubt that that will be a problem.
> > > >
> > > > I think it's not helpful. Still increases the number of messages substantially in workloads
> > > > with a lot of connections doing occasional queries. Which is common.
> > >
> > > How come?  If originally you send table statistics every 500ms, and now you send
> > > table statistics and session statistics every second, that should amount to the
> > > same thing.  Where is my misunderstanding?
> >
> > Consider the case of one query a second.
>
> I guess I am too stupid.  I don't see it.

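Finally got it.  With one query per second, every query triggers a statistics
message, so a message is sent every second whether the interval is 500 or 1000
milliseconds.  With connection statistics, that becomes twice as many messages.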

Here is my next suggestion for a band-aid to mitigate this problem:
Introduce a second, much longer interval for reporting session statistics.

--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -77,6 +77,8 @@
 #define PGSTAT_STAT_INTERVAL    500 /* Minimum time between stats file
                                      * updates; in milliseconds. */

+#define PGSTAT_CONSTAT_INTERVAL 60000   /* interval to report connection statistics */
+
 #define PGSTAT_RETRY_DELAY      10  /* How long to wait between checks for a
                                      * new file; in milliseconds. */

@@ -889,8 +891,13 @@ pgstat_report_stat(bool disconnect)
         !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
         return;

-    /* for backends, send connection statistics */
-    if (MyBackendType == B_BACKEND)
+    /*
+     * For backends, send connection statistics, but only every
+     * PGSTAT_CONSTAT_INTERVAL or when the backend terminates.
+     */
+    if (MyBackendType == B_BACKEND &&
+        (TimestampDifferenceExceeds(last_report, now, PGSTAT_CONSTAT_INTERVAL) ||
+         disconnect))
         pgstat_send_connstats(disconnect, last_report, now);

     last_report = now;

That should keep the extra load moderate, except for workloads with lots of tiny connections
(for which this may be the least of their problems).

Yours,
Laurenz Albe
