From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Sami Imseih <samimseih(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Per backend relation statistics tracking |
Date: | 2025-08-26 06:12:17 |
Message-ID: | aK1QQRiBrKq1BKdr@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Tue, Aug 26, 2025 at 08:12:45AM +0900, Michael Paquier wrote:
> On Mon, Aug 25, 2025 at 05:51:38PM -0500, Sami Imseih wrote:
> > I have not gone through them in detail yet, but +1 on adding backend activity
> > stats. This provides another level of drill down to spot anomalous sessions or
> > different patterns across applications. I also think we will want more than
> > just relation stats. For example, columns from pg_statio already look useful on
> > a per-backend aggregate level. Beyond that, I can imagine future additions like
> > number of transactions, subtransactions, I/O stats, conflicts, etc. All of these
> > seem like valuable per-backend aggregates.
>
> Even if I see this data, I am not sure how I
> would use it in correlation with the existing pg_statio_* to tune
> something, the existing views being sufficient to tune relation-level
> parameters, no?
Right, but the "opposite" is also true, how would you:
- Debug a specific increase in IO reads: When pg_stat_get_backend_io() shows
an increase in reads/read_bytes for backends, this new view allows deeper
investigation to determine if those reads are caused by table sequential scans
and how many rows are involved.
- Verify load distribution among sessions: Check if database activities are
evenly distributed across backends. Uneven distribution
could indicate outdated application versions on some hosts or suboptimal
connection pool configurations.
- Set up monitoring alerts for backends showing unusual database access patterns.
- If backends from one host show significantly more sequential scans
than others, this could indicate an outdated application version on that
host running inefficient queries.
- One could categorize applications based on workload patterns and track how
different applications use the engine.
With "only" the relation's stats at your disposal?
The relations are the "destinations" of the activity while the backends are the
"source". I think it's good to track both.
> I have equally some doubts about the value of the vacuum and analyze
> count, including the time of their latest runs. Manual vacuums are
> something that some do because autovacuum is not able to keep up,
> still the encouraged patterns involve running these in periods of
> lower-activity. How would knowing about the whole number of vacuums
> and/or analyze be useful if these are run with cron jobs, which I
> suspect involve connections that live only for the duration of one
> single job?
This one was more to track "unwanted" manual vacuums. The timestamp could
help to know which relation is involved (with a where clause on the timestamp).
> Transactions and subtransactions may be interesting to consider.
> Perhaps mainly useful to evaluate the balance activity, say with a
> connection pooler with some specific configuration or when checking
> transaction activity across multiple databases by joining with
> pg_stat_activity?
Yes exactly. And also to get XID (virtual transactions excluded) consumption
rate per backend like in [1]. In [1], the number of XIDs generated per backend
has been added.
[1]: https://www.postgresql.org/message-id/aJhOfZ2c7XdHXOAU%40ip-10-97-1-34.eu-west-3.compute.internal
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2025-08-26 06:15:28 | Re: Per backend relation statistics tracking |
Previous Message | Dilip Kumar | 2025-08-26 06:01:40 | Re: Potential problem in commit f777d773878 and 4f7f7b03758 |