Re: Creating a function for exposing memory usage of backend process

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Creating a function for exposing memory usage of backend process
Date: 2020-07-10 08:32:23
Message-ID: 24ea05d7bbdb652feef50caee6fcfbf8@oss.nttdata.com
Lists: pgsql-hackers

On 2020-07-09 02:03, Andres Freund wrote:
> Hi,
>
> I think this is an incredibly useful feature.

Thanks for your kind comments and suggestion!

> On 2020-07-07 22:02:10 +0900, torikoshia wrote:
>> > There can be multiple memory contexts with the same name. So I'm afraid
>> > that it's difficult to identify the actual parent memory context from this
>> > "parent" column. This is ok when logging memory contexts by calling
>> > MemoryContextStats() via gdb. Because child memory contexts are printed
>> > just under their parent, with indents. But this doesn't work in the view.
>> > To identify the actual parent memory or calculate the memory contexts tree
>> > from the view, we might need to assign unique ID to each memory context
>> > and display it. But IMO this is overkill. So I'm fine with current "parent"
>> > column. Thought? Do you have any better idea?
>>
>> Indeed.
>> I also feel it's not usual to assign a unique ID, which
>> can vary every time the view displayed.
>
> Hm. I wonder if we just could include the address of the context
> itself. There might be reasons not to do so (e.g. security concerns
> about leaked pointers making attacks easier), but I think it's worth
> considering.

I tried exposing the address of each context and of its parent.
A PoC patch is attached.

=# SELECT name, address, parent_address, total_bytes FROM pg_backend_memory_contexts;

           name           |  address  | parent_address | total_bytes
--------------------------+-----------+----------------+-------------
 TopMemoryContext         | 0x1280da0 |                |       80800
 TopTransactionContext    | 0x1309040 | 0x1280da0      |        8192
 Prepared Queries         | 0x138a480 | 0x1280da0      |       16384
 Type information cache   | 0x134b8c0 | 0x1280da0      |       24624
 ...
 CacheMemoryContext       | 0x12cb390 | 0x1280da0      |     1048576
 CachedPlanSource         | 0x13c47f0 | 0x12cb390      |        4096
 CachedPlanQuery          | 0x13c9ae0 | 0x13c47f0      |        4096
 CachedPlanSource         | 0x13c7310 | 0x12cb390      |        4096
 CachedPlanQuery          | 0x13c1230 | 0x13c7310      |        4096
 ...

Now it's possible to identify the actual parent memory context even when
there are multiple memory contexts with the same name.
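
To illustrate how a client could use the two address columns, here is a
minimal standalone C sketch (not part of the patch). The CtxRow struct and
the helper names are hypothetical, the rows are copied from the output
above, and a parent_address of 0 stands in for SQL NULL. It also assumes
the rows form a tree without cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row type mirroring the view columns shown above. */
typedef struct CtxRow
{
    const char *name;
    uintptr_t   address;
    uintptr_t   parent_address;     /* 0 stands in for SQL NULL */
} CtxRow;

/* Rows copied from the sample output above. */
static const CtxRow sample_rows[] = {
    {"TopMemoryContext", 0x1280da0, 0},
    {"CacheMemoryContext", 0x12cb390, 0x1280da0},
    {"CachedPlanSource", 0x13c47f0, 0x12cb390},
    {"CachedPlanQuery", 0x13c9ae0, 0x13c47f0},
    {"CachedPlanSource", 0x13c7310, 0x12cb390},
    {"CachedPlanQuery", 0x13c1230, 0x13c7310},
};

/* Return the row whose address equals addr, or NULL if there is none. */
static const CtxRow *
find_by_address(const CtxRow *rows, size_t nrows, uintptr_t addr)
{
    for (size_t i = 0; i < nrows; i++)
        if (rows[i].address == addr)
            return &rows[i];
    return NULL;
}

/* Return the parent row of "row", or NULL when "row" is the root. */
static const CtxRow *
find_parent(const CtxRow *rows, size_t nrows, const CtxRow *row)
{
    assert(row != NULL);
    if (row->parent_address == 0)
        return NULL;
    return find_by_address(rows, nrows, row->parent_address);
}

/* Count the ancestors of "row"; the root has depth 0. */
static int
context_depth(const CtxRow *rows, size_t nrows, const CtxRow *row)
{
    int         depth = 0;

    while ((row = find_parent(rows, nrows, row)) != NULL)
        depth++;
    return depth;
}
```

With these rows, the two "CachedPlanQuery" entries resolve to different
"CachedPlanSource" parents, which the name-only "parent" column could not
distinguish.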

I'm not sure, but I'm also worried that exposing addresses might cause
security-related problems.

I'd like to hear more opinions about:

- whether information for identifying the parent-child relationship is
necessary, or whether it's overkill
- if this information is necessary, whether a memory address is suitable,
or whether other means, such as assigning unique numbers, are required

>> +/*
>> + * PutMemoryContextsStatsTupleStore
>> + * One recursion level for pg_get_backend_memory_contexts.
>> + */
>> +static void
>> +PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
>> + TupleDesc tupdesc, MemoryContext context,
>> + MemoryContext parent, int level)
>> +{
>> +#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9
>> + Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
>> + bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
>> + MemoryContextCounters stat;
>> + MemoryContext child;
>> + const char *name = context->name;
>> + const char *ident = context->ident;
>> +
>> + if (context == NULL)
>> + return;
>> +
>> + /*
>> + * To be consistent with logging output, we label dynahash contexts
>> + * with just the hash table name as with MemoryContextStatsPrint().
>> + */
>> + if (ident && strcmp(name, "dynahash") == 0)
>> + {
>> + name = ident;
>> + ident = NULL;
>> + }
>> +
>> + /* Examine the context itself */
>> + memset(&stat, 0, sizeof(stat));
>> + (*context->methods->stats) (context, NULL, (void *) &level, &stat);
>> +
>> + memset(values, 0, sizeof(values));
>> + memset(nulls, 0, sizeof(nulls));
>> +
>> + values[0] = CStringGetTextDatum(name);
>> +
>> + if (ident)
>> + {
>> + int idlen = strlen(ident);
>> + char clipped_ident[MEMORY_CONTEXT_IDENT_DISPLAY_SIZE];
>> +
>> + /*
>> + * Some identifiers such as SQL query string can be very long,
>> + * truncate oversize identifiers.
>> + */
>> + if (idlen >= MEMORY_CONTEXT_IDENT_DISPLAY_SIZE)
>> + idlen = pg_mbcliplen(ident, idlen,
>> +     MEMORY_CONTEXT_IDENT_DISPLAY_SIZE - 1);
>> +
>
> Why?

As described in [1] below, overly long messages caused problems in the
past, and MemoryContextStatsPrint() now truncates ident, so I decided to
truncate it here as well.

Do you think it's not necessary here?

[1] https://www.postgresql.org/message-id/12319.1521999065@sss.pgh.pa.us
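
For reference, the kind of clipping discussed above can be sketched in
standalone C as follows. This is only an illustration, not the patch code:
utf8_cliplen() and clip_ident() are hypothetical names, and the sketch
assumes valid UTF-8 input, whereas pg_mbcliplen() works with the database
encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the largest length <= limit that does not end in the middle of
 * a UTF-8 sequence. Hypothetical stand-in for pg_mbcliplen(); assumes
 * "s" holds valid UTF-8 of "len" bytes.
 */
static size_t
utf8_cliplen(const char *s, size_t len, size_t limit)
{
    size_t      clip;

    if (len <= limit)
        return len;

    clip = limit;
    /* Back up while the first dropped byte is a continuation byte. */
    while (clip > 0 && ((unsigned char) s[clip] & 0xC0) == 0x80)
        clip--;
    return clip;
}

/*
 * Copy "ident" into "out", truncating it so that it fits in "outsize"
 * bytes including the terminating NUL.
 */
static void
clip_ident(const char *ident, char *out, size_t outsize)
{
    size_t      idlen;

    assert(ident != NULL && out != NULL && outsize > 0);
    idlen = utf8_cliplen(ident, strlen(ident), outsize - 1);
    memcpy(out, ident, idlen);
    out[idlen] = '\0';
}
```

The point of clipping on a character boundary is that the truncated ident
remains a valid string in its encoding, so it can be converted to text
without tripping over a partial multibyte character.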

Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
0006-Adding-a-function-exposing-memory-usage-of-local-backend.patch text/x-diff 12.8 KB
