Re: Per backend relation statistics tracking

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Per backend relation statistics tracking
Date: 2025-08-26 06:38:41
Message-ID: aK1WcZZyMXuub5M2@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi,

On Mon, Aug 25, 2025 at 08:28:04PM -0400, Andres Freund wrote:
> Hi,
>
> On 2025-08-12 07:48:10 +0000, Bertrand Drouvot wrote:
> > From 9e2f8cb9a87f1d9be91f2f39ef25fbb254944968 Mon Sep 17 00:00:00 2001
> > From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
> > Date: Mon, 4 Aug 2025 08:14:02 +0000
> > Subject: [PATCH v1 01/10] Adding per backend relation statistics tracking
> >
> > This commit introduces per backend relation stats tracking and adds a
> > new PgStat_BackendRelPending struct to store the pending statistics. To begin with,
> > this commit adds a new counter (heap_scan) to record the number of sequential
> > scans initiated on tables.
> >
> > This commit relies on the existing per backend statistics machinery that has been
> > added in 9aea73fc61d.
> > ---
> > src/backend/access/heap/heapam.c | 3 ++
> > src/backend/utils/activity/pgstat_backend.c | 59 +++++++++++++++++++++
> > src/include/pgstat.h | 14 +++++
> > src/include/utils/pgstat_internal.h | 3 +-
> > src/tools/pgindent/typedefs.list | 1 +
> > 5 files changed, 79 insertions(+), 1 deletion(-)
> > 73.9% src/backend/utils/activity/
> > 7.4% src/include/utils/
> > 15.4% src/include/
> > 3.2% src/
> >
> > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> > index 0dcd6ee817e..d9d6fb6c6ea 100644
> > --- a/src/backend/access/heap/heapam.c
> > +++ b/src/backend/access/heap/heapam.c
> > @@ -467,7 +467,10 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
> > * and for sample scans we update stats for tuple fetches).
> > */
> > if (scan->rs_base.rs_flags & SO_TYPE_SEQSCAN)
> > + {
> > pgstat_count_heap_scan(scan->rs_base.rs_rd);
> > + pgstat_count_backend_rel_heap_scan();
> > + }
> > }
> >
>
> I don't like that this basically doubles the overhead of keeping stats by
> tracking everything twice. The proper solution is to do that not in the hot
> path (i.e. in scans), but when summarizing stats to be flushed to the shared
> stats.

I agree. It would be similar to what already happens when the relation stats
are flushed: the database-level stats are updated at that point too. I'll use
the same approach in the next revision.
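
To illustrate the idea, here is a minimal sketch and not the actual patch. All
of the struct and function names below are hypothetical and do not match the
real pgstat code. The hot path bumps a single backend-local per-relation
counter, and the per-backend total is only derived when that pending entry is
flushed:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified types: these are NOT PostgreSQL's actual
 * structs, only stand-ins to show where the accounting happens.
 */
typedef struct RelPending
{
	uint64_t	seq_scans;		/* backend-local, not yet flushed */
} RelPending;

typedef struct RelShared
{
	uint64_t	seq_scans;		/* stands in for the shared relation entry */
} RelShared;

typedef struct BackendTotals
{
	uint64_t	seq_scans;		/* stands in for the per-backend entry */
} BackendTotals;

/* Hot path (e.g. scan start): exactly one increment, nothing else. */
static inline void
count_seq_scan(RelPending *pending)
{
	pending->seq_scans++;
}

/*
 * Flush path: fold the pending counter into the relation entry and, in
 * the same pass, into the per-backend totals. The per-backend counter
 * is never touched in the hot path.
 */
static void
flush_rel_pending(RelPending *pending, RelShared *shared,
				  BackendTotals *backend)
{
	shared->seq_scans += pending->seq_scans;
	backend->seq_scans += pending->seq_scans;
	pending->seq_scans = 0;		/* pending data has been consumed */
}
```

With this shape the cost of the per-backend counter is paid once per flush
rather than once per scan. Locking and the real shared-memory layout are
deliberately left out of the sketch.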

> FWIW, I think this was done wrongly for the per-backend IO stats too. I've
> seen the increased overhead in profiles

That's indeed something that could be improved for the per-backend IO stats too.
I'll work on this for 19 (it's probably too late for 18, and not that alarming?).

> and IO related counters aren't
> incremented remotely as often as the scan related counters are.

You mean the flushes are not triggered as often? If so, yes, that's also something
you mentioned ([1]) and that I have in mind to look at.

[1]: https://www.postgresql.org/message-id/erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l%40l3yfsq5q4pw7

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
