Re: Get memory contexts of an arbitrary backend process

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, gkokolatos(at)protonmail(dot)com, kasahara(dot)tatsuhito(at)gmail(dot)com, craig(at)2ndquadrant(dot)com
Subject: Re: Get memory contexts of an arbitrary backend process
Date: 2021-03-24 15:17:53
Message-ID: 0b0657d5febd0e46565a6bc9c62ba3f6@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021-03-23 17:24, Kyotaro Horiguchi wrote:

Thanks for reviewing and suggestions!

> At Mon, 22 Mar 2021 15:09:58 +0900, torikoshia
> <torikoshia(at)oss(dot)nttdata(dot)com> wrote in
>> >> If MemoryContextStatsPrint(), i.e. showing 100 children at most is
>> >> enough, this hard limit may be acceptable.
>> > Can't this number be passed via shared memory?
>>
>> The attached patch uses static shared memory to pass the number.
>
> "pg_print_backend_memory_contexts"
>
> That name looks like as if it returns the result as text when used on
> command-line. We could have pg_get_backend_memory_context(bool
> dump_to_log (or where to dump), int limit). Or couldn't we name it
> differently even in the ase we add a separate function?

Redefined pg_get_backend_memory_contexts() as
pg_get_backend_memory_contexts(pid, int max_children).

When pid equals 0, pg_get_backend_memory_contexts() prints local memory
contexts as original pg_get_backend_memory_contexts() does.
In this case, 'max_children' is ignored.

When 'pid' does not equal 0 and it is the PID of the client backend,
memory contexts are logged through elog().

>
> +/*
> + * MaxChildrenPerContext
> + * Max number of children to print per one parent context.
> + */
> +int *MaxChildrenPerContext = NULL;
>
> Perhaps it'd be better to have a struct even if it consists only of
> one member. (Aligned) C-int values are atomic so we can omit the
> McxtPrintLock. (I don't think it's a problem even if it is modifed
> while reading^^:)

Fixed them.

> + if(max_children <= 0)
> + {
> + ereport(WARNING,
> + (errmsg("%d is invalid value",
> max_children),
> + errhint("second parameter is the number
> of context and it must
> be set to a value greater than or equal to 1")));
>
> It's annoying to choose a number large enough when I want to dump
> children unlimitedly. Couldn't we use 0 to specify "unlimited"?

Modified as you suggested.

> + (errmsg("%d is invalid value",
> max_children),
> + errhint("second parameter is the number
> of context and it must
> be set to a value greater than or equal to 1")));
>
> For the main message, (I think) we usually spell the "%d is invalid
> value" as "maximum number of children must be positive" or such. For
> the hint, we don't need a copy of the primary section of the
> documentation here.

Modified it to "The maximum number of children must be greater than 0".

>
> I think we should ERROR out for invalid parameters, at least for
> max_children. I'm not sure about pid since we might call it based on
> pg_stat_activity..

Changed to ERROR out when the 'max_children' is less than 0.

Regarding pid, I left it untouched considering the consistency with
other
signal sending functions such as pg_cancel_backend().

> + if(!SendProcSignal(pid, PROCSIG_PRINT_MEMORY_CONTEXT,
> InvalidBackendId))
>
> We know the backendid of the process here.

Added it.

>
> + if (is_dst_stderr)
> + {
> + for (i = 0; i <= level; i++)
> + fprintf(stderr, " ");
>
> The fprintf path is used nowhere in the patch at all. It can be used
> while attaching debugger but I'm not sure we need that code. The
> footprint of this patch is largely shrinked by removing it.

According to the past discussion[1], people wanted MemoryContextStats
as it was, so I think it's better that MemoryContextStats can be used
as before.

>
> + strcat(truncated_ident, delimiter);
>
> strcpy is sufficient here. And we don't need the delimiter to be a
> variable. (we can copy a string literal into truncate_ident, then
> count the length of truncate_ident, instead of the delimiter
> variable.)

True.

> + $current_logfiles = slurp_file($node->data_dir .
> '/current_logfiles');
> ...
> +my $lfname = $current_logfiles;
> +$lfname =~ s/^stderr //;
> +chomp $lfname;
>
> $node->logfile is the current log file name.
>
> + 'target PID is not PostgreSQL server process');
>
> Maybe "check if PID check is working" or such? And, we can do
> something like the following to exercise in a more practical way.
>
> select pg_print_backend...(pid,) from pg_stat_activity where
> backend_type = 'checkpointer';

It seems better.

>> As documented, the current implementation allows that when multiple
>> pg_print_backend_memory_contexts() called in succession or
>> simultaneously, max_children can be the one of another
>> pg_print_backend_memory_contexts().
>> I had tried to avoid this by adding some state information and using
>> before_shmem_exit() in case of process termination for cleaning up the
>> state information as in the patch I presented earlier, but since
>> kill()
>> returns success before the dumper called signal handler, it seemed
>> there were times when we couldn't clean up the state.
>> Since this happens only when multiple
>> pg_print_backend_memory_contexts()
>> are called and their specified number of children are different, and
>> the
>> effect is just the not intended number of children to print, it might
>> be
>> acceptable.
>
> I see it as a non-issue. Even though the behavior looks somewhat
> strange, that usage is stranger than the behavior.

Thanks for your comments!

>> Or it might be better to wait for some seconds if num_chilren on
>> shared
>> memory is not the initialized value(meaning some other process is
>> requesting to print memory contexts).
>>
>> >> Only superusers can call pg_print_backend_memory_contexts().
>> > + /* Only allow superusers to signal superuser-owned backends. */
>> > + if (superuser_arg(proc->roleId) && !superuser())
>> > The patch seems to allow even non-superuser to request to print the
>> > memory
>> > contexts if the target backend is owned by non-superuser. Is this
>> > intentional?
>> > I think that only superuser should be allowed to execute
>> > pg_print_backend_memory_contexts() whoever owns the target backend.
>> > Because that function can cause lots of log messages.
>>
>> Thanks, it's not intentional, modified it.
>
> By the way we can send a procsig to other than a client backend. And
> most of the postgres processes are using the standard signal handler
> and intializes the procsig facility. So some of such kind of processes
> can dump the memory context information as-is. Otherwise we can add
> CHECK_FOR_INTERRUPT to appropriate place to allow that. I'm not sure
> how it is useful for other kind of processes, but it might be useful
> for autovacuum workers.

Yeah, I also think it's possible to get memory contexts other than the
client backend.
But I'm not sure whether people want to do it...

[1] https://www.postgresql.org/message-id/906.1513707472%40sss.pgh.pa.us

regards.

Attachment Content-Type Size
v3-0001-add-memorycontext-elog-print.patch text/x-diff 30.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-03-24 15:20:49 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Previous Message Robert Haas 2021-03-24 15:11:35 Re: [HACKERS] Custom compression methods