Re: shared-memory based stats collector - v70

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v70
Date: 2022-08-10 19:48:15
Message-ID: CAM-w4HN+UeLJ8MLPk45hn5Xs-CYaW28ZxyfJ_6WNvwh5C9Nn6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

One thing that's bugging me is that the names we use for these stats
are *all* over the place.

The names go through three different stages

pgstat structs -> pgstatfunc tupledescs -> pg_stat_* views

(Followed by a fourth stage where pg_exporter or whatever names for
the monitoring software)

And for some reason both transitions (plus the exporter) felt the need
to fiddle with the names or values. And not in any sort of even
vaguely consistent way. So there are three (or four) different sets of
names for the same metrics :(

e.g.

* Some of the struct elements have abbreviated words which are
expanded in the tupledesc names or the view columns -- some have long
names which get abbreviated.

* Some struct members have n_ prefixes (presumably to avoid C keywords
or other namespace issues?) and then lose them at one of the other
stages. But then the relation stats do not have n_ prefixes and then
the pg_stat view *adds* n_ prefixes in the SQL view!

* Some columns are added together in the SQL view which seems like
gratuitously hiding information from the user. The pg_stat_*_tables
view actually looks up information from the indexes stats and combines
them to get idx_scan and idx_tup_fetch.

* The pg_stat_bgwriter view returns data from two different fixed
entries, the checkpointer and the bgwriter, is there a reason those
are kept separately but then reported as if they're one thing?

Some of the simpler renaming could be transparently fixed by making
the internal stats match the public facing names. But for many of them
I think the internal names are better. And the cases where the views
aggregate data in a way that loses information are not something I
want to reproduce.

I had intended to use the internal names directly, reasoning that
transparency and consistency are the direction to be headed. But in
some cases I think the current public names are the better choice -- I
certainly don't want to remove n_* prefixes from some names but then
add them to different names! And some of the cases where the data is
combined or modified do seem like they would be missed.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-08-10 20:28:45 Re: pg_auth_members.grantor is bunk
Previous Message Robert Haas 2022-08-10 18:50:30 Re: something has gone wrong, but what is it?