Re: Proposal: Add a callback data parameter to GetNamedDSMSegment

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: Add a callback data parameter to GetNamedDSMSegment
Date: 2025-12-12 18:48:52
Message-ID: CAA5RZ0t7CrBbqW8uqUyo6bZLmMuk4Yyeobm9tEB_nPMiE4o6EQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> My trivial example for it would be persistent statistics: when I want
> to collect some information, save it to disk before shutdown, and on
> the next startup, I want to load the previous state before continuing
> collecting. pg_track_optimizer seems to do this. There are also
> definitely other reasons.

From what I can tell pg_track_optimizer has SQL functions that
flush the data to disk and loads it back. This is because dsm and
by extension dsa and dshash cannot be accessed by postmaster.

A better way to deal with this is via pluggable stats, which can be
written to disk. There is also on-going work, close to commitable,
that will provide callbacks to allow pluggable stats to
serialize/deserialize extra data that does not fit in stats ( i.e. data
stored in dsa or some dshash table ) [0].

(pgstat.c write and read the stats during checkpoint and startup,
to overcome the postmaster accessing dsm limitation).

>>> If the initialization callback function needed the name, it could be
>>> provided via the "void *" callback argument, right? I'm not following why
>>> we need to provide it separately.
>>
>> While it's true it can be passed as extra data, it is less error-prone
>> as we guarantee the real name of the segment is made available to
>> the callback. Also a caller to GetNamedDSMSegment does not need to
>> pass the name twice, as the name and as extra data. The most common
>> case I would think is using the segment name as the tranche name when
>> initializing a lwlock.

> But... they can just pass that in the "void *" argument. I'm pretty firmly
> -1 for adding more than the one callback argument here.

Ok. I was going back-forth on this, but at the end I also could
not convince myself that this is required ( at least not yet ). The main reason
I had was that an extension may want to validate that the callback is being
called during the initialization of the segment they expect, and also that
is a small convenience to simply refer to the name of the segment when
creating a tranche. But, I will agree with you that these reasons are
not justified.

As far as testing, I did not think it's worth it since in the cases
out there now
a NULL void * will result in an error when calling LWLockNewTrancheId.

See v5.

--
Sami Imseih
Amazon Web Services (AWS)

[0] https://www.postgresql.org/message-id/flat/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ(at)mail(dot)gmail(dot)com

Attachment Content-Type Size
v5-0001-Add-init_callback_arg-parameter-to-GetNamedDSMSeg.patch application/octet-stream 8.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2025-12-12 19:03:55 Re: Import Statistics in postgres_fdw before resorting to sampling.
Previous Message Tom Lane 2025-12-12 18:37:23 Re: [PATCH] pg_bsd_indent: improve formatting of multiline comments