Re: Get memory contexts of an arbitrary backend process

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: torikoshia(at)oss(dot)nttdata(dot)com
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, gkokolatos(at)protonmail(dot)com, kasahara(dot)tatsuhito(at)gmail(dot)com, craig(at)2ndquadrant(dot)com
Subject: Re: Get memory contexts of an arbitrary backend process
Date: 2021-03-23 08:24:33
Message-ID: 20210323.172433.1038800347387944240.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 22 Mar 2021 15:09:58 +0900, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote in
> >> If MemoryContextStatsPrint(), i.e. showing 100 children at most is
> >> enough, this hard limit may be acceptable.
> > Can't this number be passed via shared memory?
>
> The attached patch uses static shared memory to pass the number.

"pg_print_backend_memory_contexts"

That name looks like as if it returns the result as text when used on
command-line. We could have pg_get_backend_memory_context(bool
dump_to_log (or where to dump), int limit). Or couldn't we name it
differently even in the ase we add a separate function?

+/*
+ * MaxChildrenPerContext
+ * Max number of children to print per one parent context.
+ */
+int *MaxChildrenPerContext = NULL;

Perhaps it'd be better to have a struct even if it consists only of
one member. (Aligned) C-int values are atomic so we can omit the
McxtPrintLock. (I don't think it's a problem even if it is modifed
while reading^^:)

+ if(max_children <= 0)
+ {
+ ereport(WARNING,
+ (errmsg("%d is invalid value", max_children),
+ errhint("second parameter is the number of context and it must be set to a value greater than or equal to 1")));

It's annoying to choose a number large enough when I want to dump
children unlimitedly. Couldn't we use 0 to specify "unlimited"?

+ (errmsg("%d is invalid value", max_children),
+ errhint("second parameter is the number of context and it must be set to a value greater than or equal to 1")));

For the main message, (I think) we usually spell the "%d is invalid
value" as "maximum number of children must be positive" or such. For
the hint, we don't need a copy of the primary section of the
documentation here.

I think we should ERROR out for invalid parameters, at least for
max_children. I'm not sure about pid since we might call it based on
pg_stat_activity..

+ if(!SendProcSignal(pid, PROCSIG_PRINT_MEMORY_CONTEXT, InvalidBackendId))

We know the backendid of the process here.

+ if (is_dst_stderr)
+ {
+ for (i = 0; i <= level; i++)
+ fprintf(stderr, " ");

The fprintf path is used nowhere in the patch at all. It can be used
while attaching debugger but I'm not sure we need that code. The
footprint of this patch is largely shrinked by removing it.

+ strcat(truncated_ident, delimiter);

strcpy is sufficient here. And we don't need the delimiter to be a
variable. (we can copy a string literal into truncate_ident, then
count the length of truncate_ident, instead of the delimiter
variable.)

+ $current_logfiles = slurp_file($node->data_dir . '/current_logfiles');
...
+my $lfname = $current_logfiles;
+$lfname =~ s/^stderr //;
+chomp $lfname;

$node->logfile is the current log file name.

+ 'target PID is not PostgreSQL server process');

Maybe "check if PID check is working" or such? And, we can do
something like the following to exercise in a more practical way.

select pg_print_backend...(pid,) from pg_stat_activity where backend_type = 'checkpointer';

> As documented, the current implementation allows that when multiple
> pg_print_backend_memory_contexts() called in succession or
> simultaneously, max_children can be the one of another
> pg_print_backend_memory_contexts().
> I had tried to avoid this by adding some state information and using
> before_shmem_exit() in case of process termination for cleaning up the
> state information as in the patch I presented earlier, but since
> kill()
> returns success before the dumper called signal handler, it seemed
> there were times when we couldn't clean up the state.
> Since this happens only when multiple
> pg_print_backend_memory_contexts()
> are called and their specified number of children are different, and
> the
> effect is just the not intended number of children to print, it might
> be
> acceptable.

I see it as a non-issue. Even though the behavior looks somewhat
strange, that usage is stranger than the behavior.

> Or it might be better to wait for some seconds if num_chilren on
> shared
> memory is not the initialized value(meaning some other process is
> requesting to print memory contexts).
>
> >> Only superusers can call pg_print_backend_memory_contexts().
> > + /* Only allow superusers to signal superuser-owned backends. */
> > + if (superuser_arg(proc->roleId) && !superuser())
> > The patch seems to allow even non-superuser to request to print the
> > memory
> > contexts if the target backend is owned by non-superuser. Is this
> > intentional?
> > I think that only superuser should be allowed to execute
> > pg_print_backend_memory_contexts() whoever owns the target backend.
> > Because that function can cause lots of log messages.
>
> Thanks, it's not intentional, modified it.

By the way we can send a procsig to other than a client backend. And
most of the postgres processes are using the standard signal handler
and intializes the procsig facility. So some of such kind of processes
can dump the memory context information as-is. Otherwise we can add
CHECK_FOR_INTERRUPT to appropriate place to allow that. I'm not sure
how it is useful for other kind of processes, but it might be useful
for autovacuum workers.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2021-03-23 08:49:57 Re: tool to migrate database
Previous Message kuroda.hayato@fujitsu.com 2021-03-23 08:09:07 RE: [PATCH] Tracking statements entry timestamp in pg_stat_statements