Re: [Proposal] Adding callback support for custom statistics kinds

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Adding callback support for custom statistics kinds
Date: 2025-12-12 09:39:49
Message-ID: aTvi5ceF-pB1IbDR@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 12, 2025 at 11:58:54AM +0900, Michael Paquier wrote:
> FWIW, I have begun putting my hands on your patch, editing it at some
> degree. I am not sure that I will be able to finish that today, but
> I'm working towards getting something done.

Well, I have been able to do enough progress to have something to
share, and I'm getting pretty happy about how things are shaping. As
you will notice, I have edited quite a few things.. In details:
- Less fwrite() and fread(), more read_chunk() and write_chunk(). We
are exposing these APIs, let's use them.
- Comments, much more comments and documentation.
- The callbacks are renamed, to be more generic: "finish" for the
end-of-operation actions and to/from_serialized_data.
- The format of the extra data in the main pgstats file and the
secondary file was a bit strange. Mainly, why adding the length to
the main file and not the secondary file? I have extended that a
little bit:
-- Addition of a magic number in the main file, to provide an extra
layer of safety in the read callback, letting the callback know that
it needs to read some data.
-- The offset of the secondary file follows immeditely.
-- The secondary file includes at the offset a copy of the hash key,
the description length, and the description.
- Reorganization of the read/write flow for the callbacks in the
modules, tracking the offset at write more precisely. The handling of
the empty descriptions becomes simpler than what you have proposed
previously.

This way, we can make sure that the main stats file is OK with the
magic number, and we have a sanity check in the secondary file based
on the hash key whose copy is in the main stats file.

Regarding the error state that could be sent to the "end" callback, I
think that you are right. We are not gaining much with that as by
design we are already pretty loose on the write side, hoping for the
best, relying on the read side to enforce all sanity checks. So a
status in the "from" callback sounds like a good enough balance.

At the end of the day, I'm feeling pretty much OK with the core
changes and the layer we have here. The module changes need an extra
round of lookup (did as well some tests with corrupted and empty
secondary files to test the stability at reload), and I'm pretty tired
so I may have missed something there. The patch needs to be split in
two parts: one for the backend changes, and one for the module itself.
The backend changes are feeling pretty good, the module changes feel
better.
--
Michael

Attachment Content-Type Size
v7-0001-Allow-cumulative-statistics-to-serialize-auxiliar.patch text/x-diff 37.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Amit Kapila 2025-12-12 09:33:47 Re: Proposal: Conflict log history table for Logical Replication