Re: Enhancing Memory Context Statistics Reporting

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: Enhancing Memory Context Statistics Reporting
Date: 2025-08-13 22:35:59
Message-ID: CAH2L28sF02nZj5vEL49EsTx+TfYpB3dFgShgn0=W1V-uFZjejw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Torikoshia,

Thank you for reviewing the patch.

>
> This function is added at the end of Table "9.96. Server Signaling
> Functions", but since pg_get_process_memory_contexts outputs essentially
> the same information as pg_log_backend_memory_contexts, it might be
> better to place them next to each other in the table.
>
>
The idea was to place the new addition at the end of the table instead of
in the middle.
I’m fine with putting them together, though. I’ll do that in the next
version unless there’s a
reason not to.

>
> > + <parameter>stats_timestamp</parameter> <type>timestamptz</type> )
>
> > +typedef struct MemoryStatsDSHashEntry
> > +{
> > + char key[64];
> > + ConditionVariable memcxt_cv;
> > + int proc_id;
> > + int total_stats;
> > + bool summary;
> > + dsa_pointer memstats_dsa_pointer;
> > + TimestampTz stats_timestamp;
> > +} MemoryStatsDSHashEntry;
>
> stats_timestamp appears only in the two places below in the patch, but
> it does not seem to be actually output.
> Is this column unnecessary?
>
>
Thank you for pointing this out. This is removed in the attached patch, as
it was a
remnant from the previous design. As old statistics are discarded in the
new design,
a timestamp field is not needed anymore.

>
> Specifying 0 for timeout causes a crash:
> Should 0 be handled safely and treated as “no timeout”, or rejected as
> an error?
>

Good catch.
The crash has been resolved in the attached patch. It was caused by a
missing
ConditionVariableCancelSleep() call when exiting without statistics due to
a timeout value of 0.
A 0 timeout means that statistics should only be retrieved if they are
immediately available,
without waiting. We could exit with a warning/error saying "too low
timeout", but I think it's worthwhile
to try fetching the statistics if possible.

Similarly, specifying a negative value for timeout still works:
>
> =# select * from pg_get_process_memory_contexts(30590, true, -10);
>
> It might be better to reject negative values similar to
> pg_terminate_backend().
>
>
>
Fixed as suggested by you in the attached patch.
Currently, negative values are interpreted as an indefinite wait for
statistics.
This could cause the client to hang if the server process exits without
providing statistics.
To avoid this, it would be better to exit after displaying a warning when
the user specifies
negative timeouts.

>
> > + /* Retreive the client key fo publishing statistics */
>
> fo -> for?
>

Fixed.

> + */
> > +#define MEMSTATS_WAIT_TIMEOUT 100
>
> MEMSTATS_WAIT_TIMEOUT is defined, but it doesn’t seem to be used.
>
>
This is removed now as it was a leftover from the previous design.

The attached patch also fixes an assertion failure I observed when a client
times out
before the last requested process can publish its statistics. A client
frees the memory
reserved for storing the statistics when it exits the function after
timeout. Since a
server process was notified, it might attempt to read the same client entry
and access the dsa
memory reserved for statistics resulting in the assertion
failure. I resolved this by including a check for this scenario and then
exiting the handler
function accordingly.

Thank you,
Rahila Syed

Attachment Content-Type Size
v33-0001-Add-pg_get_process_memory_context-function.patch application/octet-stream 61.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2025-08-13 23:04:18 [PATCH] Silence Valgrind about SelectConfigFiles()
Previous Message Andres Freund 2025-08-13 22:32:13 Re: index prefetching