Re: Get memory contexts of an arbitrary backend process

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Georgios Kokolatos <gkokolatos(at)protonmail(dot)com>, Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>, craig(at)2ndquadrant(dot)com
Subject: Re: Get memory contexts of an arbitrary backend process
Date: 2021-03-17 13:24:27
Message-ID: 70ae4b79eb8b0dcf42161c80a00e3f22@oss.nttdata.com
Lists: pgsql-hackers

On 2021-03-05 14:22, Fujii Masao wrote:
> On 2021/03/04 18:32, torikoshia wrote:
>> On 2021-01-14 19:11, torikoshia wrote:
>>> Since pg_get_target_backend_memory_contexts() waits for the target
>>> backend to dump its memory contexts, it could lead to a deadlock
>>> as below.
>>>
>>>   - session1
>>>   BEGIN; TRUNCATE t;
>>>
>>>   - session2
>>>   BEGIN; TRUNCATE t; -- wait
>>>
>>>   - session1
>>>   SELECT * FROM pg_get_target_backend_memory_contexts(<pid of session 2>); -- wait
>>>
>>>
>>> Thanks for notifying me, Fujii-san.
>>>
>>>
>>> Attached is the v8 patch, which prohibits calling the function
>>> inside transactions.
>>
>> Regrettably, this modification could not cope with advisory locks,
>> and I haven't come up with a good way to deal with them.
>>
>> It seems to me that the architecture of the requestor waiting for the
>> dumper leads to this problem and complicates things.
>>
>>
>> Considering the discussion about printing backtraces[1], it seems
>> reasonable that the requestor just sends a signal and the dumper
>> dumps to the log file.
>
> +1

Thanks!

I remade the patch and introduced a function,
pg_print_backend_memory_contexts(PID), which writes the memory contexts
of the backend with the specified PID to the server log via elog.
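
To illustrate the "requestor only signals, dumper logs on its own"
design, here is a minimal standalone sketch. It is not code from the
patch: plain C signals (SIGUSR1, raise()) stand in for whatever
signaling the patch uses, and every name below is hypothetical. The
handler only sets a flag; the dump happens later, when the target
process reaches a safe point and checks that flag, so the requestor
never waits.

```c
#include <signal.h>
#include <stdio.h>

/* Hypothetical flag: set in the signal handler, consumed at a safe point. */
static volatile sig_atomic_t log_memory_contexts_pending = 0;

/* The handler does nothing but record that a dump was requested. */
static void
handle_log_request(int signo)
{
    (void) signo;
    log_memory_contexts_pending = 1;
}

/* Target side: install the handler once at startup. */
void
install_log_request_handler(void)
{
    signal(SIGUSR1, handle_log_request);
}

/*
 * Requestor side: fire and forget.  A real requestor would signal another
 * process (e.g. with kill(pid, SIGUSR1)); raise() signals ourselves so the
 * sketch stays in one process.  Returns 0 on success.
 */
int
request_memory_context_dump(void)
{
    return raise(SIGUSR1);
}

/*
 * Target side: called from the main loop at a point where logging is safe.
 * Returns 1 if a dump was performed, 0 if nothing was pending.
 */
int
process_pending_requests(void)
{
    if (!log_memory_contexts_pending)
        return 0;

    log_memory_contexts_pending = 0;
    fprintf(stderr, "LOG: Printing memory contexts of this process\n");
    return 1;
}
```

Because the requestor returns as soon as the signal is sent, it holds
nothing that the dumper could be blocked on, which is what removes the
deadlock scenario quoted above.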

=# SELECT pg_print_backend_memory_contexts(450855);

** log output **
2021-03-17 15:21:01.942 JST [450855] LOG: Printing memory contexts of PID 450855
2021-03-17 15:21:01.942 JST [450855] LOG: level: 0 TopMemoryContext: 68720 total in 5 blocks; 16312 free (15 chunks); 52408 used
2021-03-17 15:21:01.942 JST [450855] LOG: level: 1 Prepared Queries: 65536 total in 4 blocks; 35088 free (14 chunks); 30448 used
2021-03-17 15:21:01.942 JST [450855] LOG: level: 1 pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 1408 free (0 chunks); 6784 used
..(snip)..
2021-03-17 15:21:01.942 JST [450855] LOG: level: 2 CachedPlanSource: 4096 total in 3 blocks; 680 free (0 chunks); 3416 used: PREPARE hoge_200 AS SELECT * FROM pgbench_accounts WHERE aid = 1111111111111111111111111111111111111...
2021-03-17 15:21:01.942 JST [450855] LOG: level: 3 CachedPlanQuery: 4096 total in 3 blocks; 464 free (0 chunks); 3632 used
..(snip)..
2021-03-17 15:21:01.945 JST [450855] LOG: level: 1 Timezones: 104128 total in 2 blocks; 2584 free (0 chunks); 101544 used
2021-03-17 15:21:01.945 JST [450855] LOG: level: 1 ErrorContext: 8192 total in 1 blocks; 7928 free (5 chunks); 264 used
2021-03-17 15:21:01.945 JST [450855] LOG: Grand total: 2802080 bytes in 1399 blocks; 480568 free (178 chunks); 2321512 used

As shown above, the output is almost the same as that of
MemoryContextStatsPrint(), except for how the nesting level is
expressed: MemoryContextStatsPrint() uses indentation, whereas
pg_print_backend_memory_contexts() writes it as "level: %d".
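
The difference between the two styles can be sketched as follows. This
is only an illustration, not the patch's code: both helper names are
hypothetical, and the indent width of two spaces per level is an
assumption made for the example. The field layout follows the log
lines shown above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: depth is written as an explicit "level: N" prefix. */
int
format_with_level(char *buf, size_t buflen, int level, const char *name,
                  size_t total, size_t nblocks,
                  size_t freespace, size_t freechunks)
{
    return snprintf(buf, buflen,
                    "level: %d %s: %zu total in %zu blocks; "
                    "%zu free (%zu chunks); %zu used",
                    level, name, total, nblocks,
                    freespace, freechunks, total - freespace);
}

/* Hypothetical helper: same fields, depth shown as leading spaces. */
int
format_with_indent(char *buf, size_t buflen, int level, const char *name,
                   size_t total, size_t nblocks,
                   size_t freespace, size_t freechunks)
{
    /* "%*s" with an empty string pads with (2 * level) spaces (assumed width). */
    return snprintf(buf, buflen,
                    "%*s%s: %zu total in %zu blocks; "
                    "%zu free (%zu chunks); %zu used",
                    2 * level, "", name, total, nblocks,
                    freespace, freechunks, total - freespace);
}
```

An explicit level survives log prefixes and line-based filtering (for
example, grepping for "level: 1"), whereas leading whitespace is easy
to lose once each line carries a timestamp and PID prefix.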

Since it was pointed out in that discussion[1] that enlarging a
StringInfo may cause errors on OOM, this patch calls elog once for
each context instead of accumulating the whole output in one buffer.
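
A rough sketch of the per-context approach, under stated assumptions:
the Ctx struct and both function names are hypothetical, and
fprintf(stderr, ...) stands in for elog(). The point is only that each
context is formatted into a fixed-size stack buffer and emitted
immediately, so the walk never has to grow a buffer in proportion to
the number of contexts.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified context node (not PostgreSQL's MemoryContextData). */
typedef struct Ctx
{
    const char *name;
    size_t      total;
    size_t      freespace;
    const struct Ctx *firstchild;   /* first child, or NULL */
    const struct Ctx *nextsibling;  /* next sibling, or NULL */
} Ctx;

/* Stand-in for one elog(LOG, ...) call: fixed-size buffer, written at once. */
static void
emit_line(int level, const Ctx *ctx)
{
    char        line[256];

    snprintf(line, sizeof line,
             "level: %d %s: %zu total; %zu free; %zu used",
             level, ctx->name, ctx->total, ctx->freespace,
             ctx->total - ctx->freespace);
    fprintf(stderr, "LOG: %s\n", line);
}

/* Emits one line per context, depth first; returns the number of lines. */
int
log_context_tree(const Ctx *ctx, int level)
{
    int         nlines = 1;
    const Ctx  *child;

    emit_line(level, ctx);
    for (child = ctx->firstchild; child != NULL; child = child->nextsibling)
        nlines += log_context_tree(child, level + 1);
    return nlines;
}
```

The trade-off is that the dump becomes many log entries rather than
one, so lines from other processes may be interleaved with it in the
log file.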

As with MemoryContextStatsPrint(), at most 100 children are shown for
each context.
I once thought this should be configurable, but something like
pg_print_backend_memory_contexts(PID, num_children) would need to pass
'num_children' from the requestor to the dumper, which seems to
require additional infrastructure.
Creating a new GUC for this seems like overkill.
If the MemoryContextStatsPrint() behavior, i.e. showing at most 100
children, is enough, this hard limit may be acceptable.
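
For reference, a hard cap on children could look roughly like the
sketch below. It is an assumption-laden illustration, not the patch:
the struct, the function name, and the idea of folding the remaining
children into one summary line are all choices made for this example,
and children are reduced to an array of byte totals to keep it short.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result of logging one parent's children. */
typedef struct ChildSummary
{
    int         printed;        /* children logged individually */
    int         skipped;        /* children folded into the summary line */
    size_t      skipped_total;  /* bytes held by the skipped children */
} ChildSummary;

/*
 * Logs the first max_children entries one by one, then a single summary
 * line for the rest.  fprintf(stderr, ...) stands in for elog().
 */
ChildSummary
log_children(const size_t *child_totals, int nchildren, int max_children)
{
    ChildSummary s = {0, 0, 0};
    int         i;

    for (i = 0; i < nchildren; i++)
    {
        if (i < max_children)
        {
            fprintf(stderr, "LOG: child %d: %zu total\n", i, child_totals[i]);
            s.printed++;
        }
        else
        {
            s.skipped++;
            s.skipped_total += child_totals[i];
        }
    }

    if (s.skipped > 0)
        fprintf(stderr, "LOG: %d more child contexts containing %zu total\n",
                s.skipped, s.skipped_total);
    return s;
}
```

With the limit fixed at compile time (log_children(totals, n, 100) in
this sketch), nothing has to travel from the requestor to the dumper
besides the signal itself.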

Only superusers can call pg_print_backend_memory_contexts().

I'm going to add documentation and regression tests.

Any thoughts?

[1]
https://www.postgresql.org/message-id/CAMsr%2BYGh%2Bsso5N6Q%2BFmYHLWC%3DBPCzA%2B5GbhYZSGruj2d0c7Vvg%40mail.gmail.com

Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v1-0001-Add-memorycontext-elog-print.patch text/x-diff 20.7 KB
