Re: Get memory contexts of an arbitrary backend process

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, tomas(dot)vondra(at)2ndquadrant(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get memory contexts of an arbitrary backend process
Date: 2020-10-22 12:32:00
Message-ID: e609bb1609d06e674d39f101c79a4c23@oss.nttdata.com
Lists: pgsql-hackers

> On Thu, Oct 1, 2020 at 4:06 PM Kasahara Tatsuhito
> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> Hi,
>
> On Fri, Sep 25, 2020 at 4:28 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
> > Thanks for all your comments, I updated the patch.
> Thanks for updating the patch.
> I did a brief test and code review.

Thanks for your tests and review!

> > I added a shared hash table consisted of minimal members
> > mainly for managing whether the file is dumped or not.
> > Some members like 'loc' seem useful in the future, but I
> > haven't added them since it's not essential at this point.
> Yes, that would be good.
>
> +        /*
> +         * Since we allow only one session can request to dump memory context at
> +         * the same time, check whether the dump files already exist.
> +         */
> +        while (stat(dumpfile, &stat_tmp) == 0 || stat(tmpfile, &stat_tmp) == 0)
> +        {
> +            pg_usleep(1000000L);
> +        }
>
> If pg_get_backend_memory_contexts() is executed by two or more
> sessions at the same time, it cannot be run exclusively in this way.
> Currently it seems to cause a crash when you do so.
> This is easy to reproduce and can be done as follows.
>
> [session-1]
> BEGIN;
> LOCK TABLE t1;
>   [Session-2]
>   BEGIN;
>   LOCK TABLE t1; <- waiting
>     [Session-3]
>     select * FROM pg_get_backend_memory_contexts(<pid of session-2>);
>       [Session-4]
>       select * FROM pg_get_backend_memory_contexts(<pid of session-2>);
>
> If you issue commit or abort at session-1, you will get SEGV.
>
> Instead of checking for the existence of the file, it might be better
> to use a hash (mcxtdumpHash) entry with LWLock.

Thanks!
I added an LWLock and switched from checking for the file's existence
to looking up the hash entry.
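To illustrate, the per-target exclusion could look roughly like the sketch below. This is only an assumption of how the patch might do it; the names McxtDumpHashLock, mcxtdumpHash, and McxtDumpEntry are hypothetical, not taken from the actual patch.

```c
/* Sketch only: entry/lock names are assumptions, not the patch's code. */
McxtDumpEntry *entry;
bool           found;

LWLockAcquire(McxtDumpHashLock, LW_EXCLUSIVE);

entry = (McxtDumpEntry *) hash_search(mcxtdumpHash, &dst_pid,
                                      HASH_ENTER, &found);
if (found)
{
    /* Another session is already dumping this backend's contexts. */
    LWLockRelease(McxtDumpHashLock);
    ereport(ERROR,
            (errmsg("memory context dump for PID %d is already in progress",
                    dst_pid)));
}
entry->is_dumped = false;
LWLockRelease(McxtDumpHashLock);
```

Because the check and the insertion happen atomically under the LWLock, two concurrent requesters for the same PID cannot both proceed, which the file-existence check could not guarantee.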

> +        if (proc == NULL)
> +        {
> +            ereport(WARNING,
> +                    (errmsg("PID %d is not a PostgreSQL server process", dst_pid)));
> +            return (Datum) 1;
> +        }
>
> Shouldn't it clear the hash entry before return?

Yeah, I added code to remove the entry.

>
> +        /* Wait until target process finished dumping file. */
> +        while (!entry->is_dumped)
> +        {
> +            CHECK_FOR_INTERRUPTS();
> +            pg_usleep(10000L);
> +        }
>
> If the target is killed and exits before dumping the memory
> information, you're in an infinite loop here.
> So how about making sure that any process that has to stop before
> dumping memory changes the status of the hash entry
> (entry->is_dumped) before stopping, so that the caller can detect it?
> I'm not sure it's best or not, but you might want to use something
> like the on_shmem_exit callback.

Thanks for your idea!
Added a callback to change the status of the hash table entry.

Although the callback needs to be removed once the memory dump has
finished, on_shmem_exit() does not seem to have a corresponding removal
function.
I used before_shmem_exit() instead, since cancel_before_shmem_exit()
can remove a registered callback.
If that's inappropriate, I'm going to add a function that removes
callbacks registered with on_shmem_exit().
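In other words, the target process would bracket the dump with a register/cancel pair, something like the sketch below. The callback name McxtDumpShmemExit is a hypothetical placeholder, not the actual name in the patch.

```c
/* Sketch only: McxtDumpShmemExit is an assumed callback name. */

/* If we die before finishing, mark the entry so waiters stop looping. */
before_shmem_exit(McxtDumpShmemExit, (Datum) 0);

/* ... write the dump file and set entry->is_dumped = true ... */

/* Dump completed normally; the exit-time cleanup is no longer needed. */
cancel_before_shmem_exit(McxtDumpShmemExit, (Datum) 0);
```

The register/cancel pairing is why before_shmem_exit() fits here: on_shmem_exit() has no matching cancel function, so a callback registered with it would fire on every later exit.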

> In the current design, if the caller stops processing before reading
> the dumped file, you will have an orphaned file.
> It looks like the following.
>
> [session-1]
> BEGIN;
> LOCK TABLE t1;
>   [Session-2]
>   BEGIN;
>   LOCK TABLE t1; <- waiting
>     [Session-3]
>     select * FROM pg_get_backend_memory_contexts(<pid of session-2>);
>
> If you cancel or terminate the session-3, then issue commit or abort
> at session-1, you will get orphaned files in pg_memusage.
>
> So if you allow only one session to request a dump at a time, it
> could call pg_memusage_reset() before sending the signal in this function.

Although I'm going to allow only one requesting session per target
process, I'd like to allow multiple pg_get_backend_memory_contexts()
calls to run concurrently as long as their target processes differ.

Instead of calling pg_memusage_reset(), I added a callback that cleans
up orphaned files and the corresponding hash table entries, registered
via before_shmem_exit() through PG_ENSURE_ERROR_CLEANUP() and
PG_END_ENSURE_ERROR_CLEANUP().

I chose PG_ENSURE_ERROR_CLEANUP() and PG_END_ENSURE_ERROR_CLEANUP()
here since it can handle not only termination but also cancellation.
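As a rough sketch, the requesting side might wrap the signal-and-wait section like this. The cleanup function name McxtReqCleanup is an assumption for illustration; the actual patch may differ.

```c
/* Sketch only: McxtReqCleanup is an assumed cleanup-function name. */
PG_ENSURE_ERROR_CLEANUP(McxtReqCleanup, Int32GetDatum(dst_pid));
{
    /* Signal the target process, then wait for entry->is_dumped. */
    while (!entry->is_dumped)
    {
        CHECK_FOR_INTERRUPTS();
        pg_usleep(10000L);
    }
    /* ... read and return the dumped file ... */
}
PG_END_ENSURE_ERROR_CLEANUP(McxtReqCleanup, Int32GetDatum(dst_pid));
```

Since PG_ENSURE_ERROR_CLEANUP() arranges for the cleanup function to run on both process exit and error (including query cancellation), a cancelled requester would still remove its dump file and hash entry rather than leaving them orphaned.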

Any thoughts?

--
Atsushi Torikoshi

Attachment Content-Type Size
0003-Enabled-pg_get_backend_memory_contexts-to-collect.patch text/x-diff 29.2 KB
