Re: Get memory contexts of an arbitrary backend process

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Georgios Kokolatos <gkokolatos(at)protonmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get memory contexts of an arbitrary backend process
Date: 2020-12-10 01:48:00
Message-ID: 28e1feb71b94ac9918830bc4a0c1844f@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-12-04 19:16, torikoshia wrote:
> On 2020-12-03 10:36, Tom Lane wrote:
>> Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> writes:
>>> I'm starting to study how this feature behaves. At first, when I
>>> executed
>>> the following query, the function never returned. ISTM that since
>>> the autovacuum launcher cannot respond to the request of memory
>>> contexts dump, the function keeps waiting infinity. Is this a bug?
>>> Probably we should exclude non-backend proceses from the target
>>> processes to dump? Sorry if this was already discussed.
>>
>>> SELECT pg_get_backend_memory_contexts(pid) FROM
>>> pg_stat_activity;
>
> Thanks for trying it!
>
> It was not discussed explicitly, and I was going to do it later
> as commented.
>
>> + /* TODO: Check also whether backend or not. */
>
>>
>> FWIW, I think this patch is fundamentally unsafe. It's got a
>> lot of the same problems that I complained about w.r.t. the
>> nearby proposal to allow cross-backend stack trace dumping.
>> It does avoid the trap of thinking that it can do work in
>> a signal handler, but instead it supposes that it can do
>> work involving very high-level objects such as shared hash tables
>> in anyplace that might execute CHECK_FOR_INTERRUPTS. That's
>> never going to be safe: the only real expectation the system
>> has is that CHECK_FOR_INTERRUPTS is called at places where our
>> state is sane enough that a transaction abort can clean up.
>> Trying to do things like taking LWLocks is going to lead to
>> deadlocks or worse. We need not even get into the hard questions,
>> such as what happens when one process or the other exits
>> unexpectedly.
>
> Thanks for reviewing!
>
> I may misunderstand something, but the dumper works not at
> CHECK_FOR_INTERRUPTS but during the client read, i.e.,
> ProcessClientReadInterrupt().
>
> Is it also unsafe?
>
>
> BTW, since there was a comment that the shared hash table
> used too much memory, I'm now rewriting this patch not to use
> the shared hash table but a simpler static shared memory struct.

Attached a rewritten patch.

Accordingly, I also slightly modified the basic design as below.

---
# Communication flow between the dumper and the requester
- (1) When requesting memory context dumping, the dumper changes
the struct on the shared memory state from 'ACCEPTABLE' to
'REQUESTING'.
- (2) The dumper sends the signal to the dumper process and wait on
the latch.
- (3) When the dumper noticed the signal, it changes the state to
'DUMPING'.
- (4) When the dumper completes dumping, it changes the state to
'DONE' and set the latch.
- (5) The requestor reads the dump file and shows it to the user.
Finally, the requestor removes the dump file and reset the shared
memory state to 'ACCEPTABLE'.

# Query cancellation
- When the requestor cancels dumping, e.g. signaling using ctrl-C,
the requestor changes the state of the shared memory to 'CANCELING'.
- The dumper checks the state when it tries to change the state to
'DONE' at (4), and if the state is 'CANCELING', it initializes the
dump file and reset the shared memory state to 'ACCEPTABLE'.

# Cleanup dump file and the shared memory
- In the normal case, the dumper removes the dump file and resets
the shared memory entry as described in (5).
- When something like query cancellation or process termination
happens on the dumper after (1) and before (3), in other words,
the state is 'REQUESTING', the requestor does the cleanup.
- When something happens on the dumper or the requestor after (3)
and before (4), in other words, the state is 'DUMPING', the dumper
does the cleanup. Specifically, if the requestor cancels the query,
it just changes the state to 'CANCELING' and the dumper notices it
and cleans up things later.
OTOH, when the dumper fails to dump, it cleans up the dump file and
reset the shared memory state.
- When something happens on the requestor after (4), i.e., the state
is 'DONE', the requestor does the cleanup.
- In the case of receiving SIGKILL or power failure, all dump files
are removed in the crash recovery process.
---

>
>> I also find the idea that this should be the same SQL function
>> as pg_get_backend_memory_contexts to be a seriously bad decision.
>> That means that it's not possible to GRANT the right to examine
>> only your own process's memory --- with this proposal, that means
>> granting the right to inspect every other process as well.
>>
>> Beyond that, the fact that there's no way to restrict the capability
>> to just, say, other processes owned by the same user means that
>> it's not really safe to GRANT to non-superusers anyway. Even with
>> such a restriction added, things are problematic, since for example
>> it would be possible to inquire into the workings of a
>> security-definer function executing in another process that
>> nominally is owned by your user.
>
> I'm going to change the function name and restrict the executor to
> superusers. Is it enough?

In the attached patch, I changed the function name to
pg_get_target_backend_memory_contexts() for now.

Regards,

Attachment Content-Type Size
v5-0001-Add-pg_get_target_backend_memory_contexts-to-coll.patch text/x-diff 32.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Patrick Handja 2020-12-10 01:54:13 Re: Setof RangeType returns
Previous Message Bharath Rupireddy 2020-12-10 01:44:33 Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit