Re: Parent/child context relation in pg_get_backend_memory_contexts()

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parent/child context relation in pg_get_backend_memory_contexts()
Date: 2023-10-18 19:53:30
Message-ID: ZTA3umHA9cArtuGv@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Melih Mutlu (m(dot)melihmutlu(at)gmail(dot)com) wrote:
> Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, 16 Haz 2023 Cum, 17:03 tarihinde şunu
> yazdı:
>
> > With this change, here's a query to find how much space used by each
> > context including its children:
> >
> > > WITH RECURSIVE cte AS (
> > > SELECT id, total_bytes, id as root, name as root_name
> > > FROM memory_contexts
> > > UNION ALL
> > > SELECT r.id, r.total_bytes, cte.root, cte.root_name
> > > FROM memory_contexts r
> > > INNER JOIN cte ON r.parent_id = cte.id
> > > ),
> > > memory_contexts AS (
> > > SELECT * FROM pg_backend_memory_contexts
> > > )
> > > SELECT root as id, root_name as name, sum(total_bytes)
> > > FROM cte
> > > GROUP BY root, root_name
> > > ORDER BY sum DESC;
>
> Given that the above query to get total bytes including all children is
> still a complex one, I decided to add an additional info in
> pg_backend_memory_contexts.
> The new "path" field displays an integer array that consists of ids of all
> parents for the current context. This way it's easier to tell whether a
> context is a child of another context, and we don't need to use recursive
> queries to get this info.

Nice, this does seem quite useful.

> Here how pg_backend_memory_contexts would look like with this patch:
>
> postgres=# SELECT name, id, parent, parent_id, path
> FROM pg_backend_memory_contexts
> ORDER BY total_bytes DESC LIMIT 10;
> name | id | parent | parent_id | path
> -------------------------+-----+------------------+-----------+--------------
> CacheMemoryContext | 27 | TopMemoryContext | 0 | {0}
> Timezones | 124 | TopMemoryContext | 0 | {0}
> TopMemoryContext | 0 | | |
> MessageContext | 8 | TopMemoryContext | 0 | {0}
> WAL record construction | 118 | TopMemoryContext | 0 | {0}
> ExecutorState | 18 | PortalContext | 17 | {0,16,17}
> TupleSort main | 19 | ExecutorState | 18 | {0,16,17,18}
> TransactionAbortContext | 14 | TopMemoryContext | 0 | {0}
> smgr relation table | 10 | TopMemoryContext | 0 | {0}
> GUC hash table | 123 | GUCMemoryContext | 122 | {0,122}
> (10 rows)
>
> An example query to calculate the total_bytes including its children for a
> context (say CacheMemoryContext) would look like this:
>
> WITH contexts AS (
> SELECT * FROM pg_backend_memory_contexts
> )
> SELECT sum(total_bytes)
> FROM contexts
> WHERE ARRAY[(SELECT id FROM contexts WHERE name = 'CacheMemoryContext')] <@
> path;

I wonder if we should perhaps just include
"total_bytes_including_children" as another column? Certainly seems
like a very useful thing that folks would like to see. We could do that
either with C, or even something as simple as changing the view to do
something like:

WITH contexts AS MATERIALIZED (
SELECT * FROM pg_get_backend_memory_contexts()
)
SELECT
*,
coalesce
(
(
(SELECT sum(total_bytes) FROM contexts WHERE ARRAY[a.id] <@ path)
+ total_bytes
),
total_bytes
) AS total_bytes_including_children
FROM contexts a;

> We still need to use cte since ids are not persisted and might change in
> each run of pg_backend_memory_contexts. Materializing the result can
> prevent any inconsistencies due to id change. Also it can be even good for
> performance reasons as well.

I don't think we really want this to be materialized, do we? Where this
is particularly interesting is when it's being dumped to the log ( ...
though I wish we could do better than that and hope we do in the future)
while something is ongoing in a given backend and if we do that a few
times we are able to see what's changing in terms of allocations,
whereas if we materialized it (when? transaction start? first time
it's asked for?) then we'd only ever get the one view from whenever the
snapshot was taken.

> Any thoughts?

Generally +1 from me for working on improving this.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2023-10-18 20:11:59 Re: pg_dump needs SELECT privileges on irrelevant extension table
Previous Message Mikhail Gribkov 2023-10-18 19:36:58 Avoid race condition for event_triggers regress test