From: Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, tomas(dot)vondra(at)2ndquadrant(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get memory contexts of an arbitrary backend process
Date: 2020-10-01 07:06:23
Message-ID: CAP0=ZVKRLH__vkc+4s+uo2+PKb79jvpyKy7HCSMLj0WNOurVxQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Fri, Sep 25, 2020 at 4:28 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> Thanks for all your comments, I updated the patch.
Thanks for updating the patch.
I did a brief test and code review.

> I added a shared hash table consisted of minimal members
> mainly for managing whether the file is dumped or not.
> Some members like 'loc' seem useful in the future, but I
> haven't added them since it's not essential at this point.
Yes, that would be good.

+ /*
+  * Since we allow only one session can request to dump memory context at
+  * the same time, check whether the dump files already exist.
+  */
+ while (stat(dumpfile, &stat_tmp) == 0 || stat(tmpfile, &stat_tmp) == 0)
+ {
+     pg_usleep(1000000L);
+ }

If pg_get_backend_memory_contexts() is executed by two or more
sessions at the same time, this check does not provide mutual exclusion.
Currently it seems to cause a crash when doing so.
This is easy to reproduce, as follows.

[Session-1]
BEGIN;
LOCK TABLE t1;
[Session-2]
BEGIN;
LOCK TABLE t1; <- waiting
[Session-3]
SELECT * FROM pg_get_backend_memory_contexts(<pid of session-2>);
[Session-4]
SELECT * FROM pg_get_backend_memory_contexts(<pid of session-2>);

If you issue commit or abort at session-1, you will get SEGV.

Instead of checking for the existence of the file, it might be better
to use a hash (mcxtdumpHash) entry with LWLock.
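
For example, the check-and-claim could be done atomically under an
LWLock on the shared hash, roughly like this (a sketch only:
mcxtdumpHash comes from the patch, but McxtDumpLock, the entry layout,
and the error wording are my assumptions):

```c
/* Sketch: claim the dump slot atomically instead of stat()ing files. */
LWLockAcquire(McxtDumpLock, LW_EXCLUSIVE);
entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &dst_pid,
                                      HASH_ENTER, &found);
if (found)
{
    /* Another session is already dumping this target process. */
    LWLockRelease(McxtDumpLock);
    ereport(ERROR,
            (errmsg("memory context dump for PID %d is already in progress",
                    dst_pid)));
}
entry->is_dumped = false;
LWLockRelease(McxtDumpLock);
```

With this, a second concurrent caller fails cleanly instead of racing
on the file existence check.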

+ if (proc == NULL)
+ {
+     ereport(WARNING,
+             (errmsg("PID %d is not a PostgreSQL server process", dst_pid)));
+     return (Datum) 1;
+ }

Shouldn't it clear the hash entry before return?

+ /* Wait until target process finished dumping file. */
+ while (!entry->is_dumped)
+ {
+     CHECK_FOR_INTERRUPTS();
+     pg_usleep(10000L);
+ }

If the target is killed and exits before dumping the memory
information, the caller is stuck in an infinite loop here.
So how about making sure that any process that has to stop before
doing a memory dump changes the status of the hash entry
(entry->is_dumped) before stopping, so that the caller can detect it?
I'm not sure it's the best way, but you might want to use something
like the on_shmem_exit callback.
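
For example, a callback like the following, registered in the target
backend before it starts dumping, could flip the flag even on an
abnormal exit (a sketch only: the callback name is my invention;
on_shmem_exit() itself is the existing API):

```c
/* Sketch: registered via on_shmem_exit() in the target backend. */
static void
McxtDumpShmemExit(int code, Datum arg)
{
    mcxtdumpEntry *entry;
    bool           found;

    LWLockAcquire(McxtDumpLock, LW_EXCLUSIVE);
    entry = (mcxtdumpEntry *) hash_search(mcxtdumpHash, &MyProcPid,
                                          HASH_FIND, &found);
    if (found)
        entry->is_dumped = true;    /* let the waiting caller proceed */
    LWLockRelease(McxtDumpLock);
}

/* ... in the target backend, before starting the dump: */
on_shmem_exit(McxtDumpShmemExit, (Datum) 0);
```

The caller would then notice the flag is set but the dump file is
missing, and report an error instead of waiting forever.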

In the current design, if the caller stops processing before reading
the dumped file, you will have an orphaned file.
It looks like the following.

[Session-1]
BEGIN;
LOCK TABLE t1;
[Session-2]
BEGIN;
LOCK TABLE t1; <- waiting
[Session-3]
SELECT * FROM pg_get_backend_memory_contexts(<pid of session-2>);

If you cancel or terminate session-3 and then issue a commit or abort
at session-1, you will get orphaned files in pg_memusage.

So if you allow only one session to request a dump at a time, this
function could call pg_memusage_reset() before sending the signal.
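
The requesting side would then look roughly like this (a sketch only:
pg_memusage_reset() is the patch's cleanup function, but
PROCSIG_DUMP_MEMCXT and the exact call site are my assumptions):

```c
/*
 * Clean up any leftover dump files from a previous request that was
 * cancelled before its files were read, then signal the target.
 */
pg_memusage_reset(dst_pid);

if (SendProcSignal(dst_pid, PROCSIG_DUMP_MEMCXT, InvalidBackendId) < 0)
    ereport(ERROR,
            (errmsg("could not signal backend with PID %d", dst_pid)));
```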

Best regards,

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com
