Quick Links

Re: Get rid of pgstat_count_backend_io_op*() functions

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Get rid of pgstat_count_backend_io_op*() functions
Date:	2025-09-25 06:42:33
Message-ID:	aNTkWRZAg9FCBKy0@paquier.xyz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Sep 24, 2025 at 07:48:32AM +0000, Bertrand Drouvot wrote:
> On Wed, Sep 03, 2025 at 07:33:37AM +0000, Bertrand Drouvot wrote:
>> As far the ordering concern for v1, what about:
>>
>> - let backend kind enum defined as 6
>> - move the backend flush outside of the loop in pgstat_report_stat()
>>
>> That way the backend are flushed last and that solves your concern about custom
>> stats kind.
>>
>> The backend would not be the "only" exception in pgstat_report_stat(), the db
>> stats are already handled as exception with pgstat_update_dbstats().
>
> That would give something like v3 attached, thoughts?
>
> Once we agree on the approach to deal with per backend pending stats, I'll make
> use of the same in [1] and [2].

for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
{
[...]
+ if (kind == PGSTAT_KIND_BACKEND)
+ continue;
if (!kind_info)
continue;
if (!kind_info->flush_static_cb)

partial_flush |= kind_info->flush_static_cb(nowait);
}
+
+ /*
+ * Do per-backend last as some of their pending stats are populated
+ * during the flush of other stats kinds.
+ */
+ kind_info = pgstat_get_kind_info(PGSTAT_KIND_BACKEND);
+ partial_flush |= kind_info->flush_static_cb(nowait);
}

Hmm. I really have an issue with the ordering dependency this
introduces between stats kinds, assumed to happen inside the code. Up
to now, we have always considered stats kinds as rather independent
pieces at flush time, simplifying its code (I am not saying counter
increment time). I get that you are doing what you do here because
the IO pending statistics (PendingIOStats) are reset each time the
stat IO flush callback finishes its job, and we'd lose the amount of
stats saved between two reports.

Putting aside the style issue (aka I like the pgstat_count_backend_*
interface, rather than copying the pending data), I think that we lack
data to be able to clearly identify how and actually how much we
should try to optimize and change this code:
- Should we try to use more inlining?
- Should we try to use more static structures, without any functions
at all.
- Should we try to prefer copies? In which case, could it make sense
to introduce a concept of dependency between stats kinds? This way,
we could order how the flush actions are taken rather than having a
for loop based on the sole stats kind IDs.
- Should we try to reduce diff operations at flush time? This is
something done by the WAL stats and worry about their costs, with
diffs calculated each time between the current and previous states?
With a non-forced report delay of 10s, this does not worry me, short
transactions would, as we force reports each time a transaction
finishes.

For the first and second point, more optimization could be done for
more stats kinds. For the last point, we'd want to reconsider how
pgstat_wal.c does its job. As the proposal stands, at least it seems
to me, this sacrifices code readability, though I get that the opinion
of each may differ on the matter. And perhaps there are practices
that we may be able to use in core for all the stats kinds, as well.
That would mean doing benchmarks with specific workloads on big
hardware, likely. So, what I am seeing currently here is an
incomplete picture.
--
Michael

In response to

Re: Get rid of pgstat_count_backend_io_op*() functions at 2025-09-24 07:48:32 from Bertrand Drouvot

Responses

Re: Get rid of pgstat_count_backend_io_op*() functions at 2025-09-25 14:49:31 from Bertrand Drouvot

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2025-09-25 06:45:21	Re: Orphan page in _bt_split
Previous Message	Konstantin Knizhnik	2025-09-25 06:41:00	Re: Orphan page in _bt_split