| From: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> |
|---|---|
| To: | Rahila Syed <rahilasyed90(at)gmail(dot)com> |
| Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Enhancing Memory Context Statistics Reporting |
| Date: | 2025-12-08 01:26:33 |
| Message-ID: | 44a9d3857aaed49b818dc893dde9f6b4@oss.nttdata.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 2025-11-28 18:22, Rahila Syed wrote:
Hi,
> I'm attaching the updated patches, which primarily include cleanup and
> have been rebased
> following the CFbot report.
Thanks for updating the patch!
I observed an assertion failure when forcing a timeout as follows:
```
$ psql
(pid:38587)=#
$ kill -s SIGSTOP 38587
$ psql
(pid:38618) =# select * from pg_get_process_memory_contexts(38587,
false);
name | ident | type | level | path | total_bytes |
total_nblocks | free_bytes | free_chunks | used_bytes | num_agg_contexts
--------+--------+--------+--------+--------+-------------+---------------+------------+-------------+------------+------------------
[NULL] | [NULL] | [NULL] | [NULL] | [NULL] | [NULL] |
[NULL] | [NULL] | [NULL] | [NULL] | [NULL]
(1 row)
Time: 5013.515 ms (00:05.014)
$ kill -s SIGCONT 38587
$ tail postgresql.log
TRAP: failed Assert("client_keys[MyProcNumber] != -1"), File:
"mcxtfuncs.c", Line: 881, PID: 38587
0 postgres 0x0000000104943400
ExceptionalCondition + 216
1 postgres 0x000000010480f738
ProcessGetMemoryContextInterrupt + 140
2 postgres 0x00000001046f2710
ProcessInterrupts + 3008
3 postgres 0x00000001046f1a78
ProcessClientReadInterrupt + 80
4 postgres 0x0000000104433994 secure_read
+ 404
5 postgres 0x00000001044411dc pq_recvbuf
+ 260
6 postgres 0x0000000104441088 pq_getbyte
+ 96
7 postgres 0x00000001046fa0fc
SocketBackend + 44
8 postgres 0x00000001046f6d3c ReadCommand
+ 44
9 postgres 0x00000001046f6284
PostgresMain + 2900
10 postgres 0x00000001046ed558
BackendInitialize + 0
11 postgres 0x00000001045c0a48
postmaster_child_launch + 456
12 postgres 0x00000001045c8520
BackendStartup + 304
13 postgres 0x00000001045c636c ServerLoop
+ 372
14 postgres 0x00000001045c4e24
PostmasterMain + 6448
15 postgres 0x0000000104445b4c main + 924
16 dyld 0x0000000188662b98 start +
6076
2025-12-08 07:35:32.608 JST [38540] LOG: 00000: client backend (PID
38587) was terminated by signal 6: Abort trap: 6
2025-12-08 07:35:32.608 JST [38540] LOCATION: LogChildExit,
postmaster.c:2872
```
Below are comments regarding the v42-0001 patch:
> In order to not block on busy processes, we have hardcoded
> the number of seconds during which to retry before timing out.
> In the case where no statistics are published within the set
> timeout, NULL is returned.
It might be good to also document in func-admin.sgml that the function
times out after 5 seconds when the target backend does not respond, and
that in such a case NULLs are returned.
+ * If DSA exists, created by another process requesting statistics,
attach
+ * to it. We expect the client process to create required DSA and
Dshash
+ * table.
+ */
+ if (MemoryStatsDsaArea == NULL)
+ MemoryStatsDsaArea =
GetNamedDSA("memory_context_statistics_dsa",
+ &found);
+
+ if (MemoryStatsDsHash == NULL)
+ MemoryStatsDsHash =
GetNamedDSHash("memory_context_statistics_dshash",
+ &memctx_dsh_params, &found);
From the comment, it sounded to me as if the client executing
pg_get_process_memory_contexts() might not create the DSA in some cases.
Is it correct to assume that such a situation can happen?
In [1], as a response to concerns about using DSA inside a CFI handler,
you wrote that “all the dynamic shared memory needed to store the
statistics is created and deleted in the client function”.
So I understood that it would never create the DSA inside the CFI
handler.
If that understanding is correct, perhaps the comment should be reworded
to make that clear.
+ context_id_lookup =
hash_create("pg_get_remote_backend_memory_contexts",
This appears to use the old function name. Should this be updated to
"pg_get_process_memory_contexts" instead?
--
Regards,
--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Zhijie Hou (Fujitsu) | 2025-12-08 01:56:42 | RE: Newly created replication slot may be invalidated by checkpoint |
| Previous Message | Michael Paquier | 2025-12-08 01:25:45 | Re: Extended Statistics set/restore/clear functions. |