Re: Get rid of pgstat_count_backend_io_op*() functions

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Get rid of pgstat_count_backend_io_op*() functions
Date: 2025-09-03 05:47:51
Message-ID: aLfWhz8kwjsvrjah@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 02, 2025 at 12:42:54PM -0400, Andres Freund wrote:
> On 2025-09-02 08:19:22 +0900, Michael Paquier wrote:
>> On Mon, Sep 01, 2025 at 02:07:27PM +0000, Bertrand Drouvot wrote:
>>> Instead, it now copies the IO pending stats to the backend IO pending stats when
>>> the pending IO stats are flushed. This behaves the same way as for some relation
>>> and database stats (see pgstat_relation_flush_cb()).
>>>
>>> It's done that way to avoid incrementing the "same" counters twice as it produces
>>> increased overhead in profiles (reported by Andres in [1]).
>>
>> So, is the complaint about the addition of the extra incrementations
>> for backend counters on top of the existing IO counters in v17,
>> leading to more instructions generated, the addition of the functions,
>> or both things at the same time?
>
> The latter, I think.

Another option would be more inlining, applying the same to the
pgstat_io.c parts. It sounds to me like that limiting the number of
incrementations to happen only once when shared across multiple stats
kinds would be beneficial anyway, so we could just do that. My whole
picture would then come to:
- Split the IO counters (PendingIOStats) into a separate file, with
perhaps inline functions to have a more elegant layer, somewhat
relevant for pgstat_report_fixed as well.
- Update the stats kinds to apply the diffs in the counters at flush
time.

>> Removing both the function calls and the extra counter incrementations
>> means to do the same thing as the WAL stats, where we have one
>> structure in charge of incrementing the counters, then the data diffs
>> are fed when flushing the entries in all the stats kinds.
>
> I think that's the wrong direction to go. Diffing stats is far from cheap and
> gets more expensive the more detail you add to stats.

Even if we only do the diffs calculations when flushing the pending
stats in the flush callbacks? That would mean to calculate the diffs
when stats reports are forced at transaction commit or when the delay
threshold is reached in non-force mode for pgstat_report_stat().

> EXPLAIN ANALYZE spends a large chunk of the time doing diffing of buffer
> access stats, for example. We need to work towards doing less of that stuff,
> not more.

Yep, agreed. I've seen people complaining about the overhead that can
happen in the EXPLAIN path, even for stuff aimed to be maintained
outside of core, that uses the backend hooks.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2025-09-03 06:25:33 Re: Orphan page in _bt_split
Previous Message Peter Eisentraut 2025-09-03 05:47:37 Solaris compiler status