Re: Parent/child context relation in pg_get_backend_memory_contexts()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parent/child context relation in pg_get_backend_memory_contexts()
Date: 2023-10-12 16:23:09
Message-ID: 20231012162309.4wmc2nb5tkt6pzml@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-08-04 21:16:49 +0300, Melih Mutlu wrote:
> Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, 16 Haz 2023 Cum, 17:03 tarihinde şunu
> yazdı:
>
> > With this change, here's a query to find how much space used by each
> > context including its children:
> >
> > > WITH RECURSIVE cte AS (
> > > SELECT id, total_bytes, id as root, name as root_name
> > > FROM memory_contexts
> > > UNION ALL
> > > SELECT r.id, r.total_bytes, cte.root, cte.root_name
> > > FROM memory_contexts r
> > > INNER JOIN cte ON r.parent_id = cte.id
> > > ),
> > > memory_contexts AS (
> > > SELECT * FROM pg_backend_memory_contexts
> > > )
> > > SELECT root as id, root_name as name, sum(total_bytes)
> > > FROM cte
> > > GROUP BY root, root_name
> > > ORDER BY sum DESC;
> >
>
> Given that the above query to get total bytes including all children is
> still a complex one, I decided to add an additional info in
> pg_backend_memory_contexts.
> The new "path" field displays an integer array that consists of ids of all
> parents for the current context. This way it's easier to tell whether a
> context is a child of another context, and we don't need to use recursive
> queries to get this info.

I think that does make it a good bit easier. Both to understand and to use.

> Here how pg_backend_memory_contexts would look like with this patch:
>
> postgres=# SELECT name, id, parent, parent_id, path
> FROM pg_backend_memory_contexts
> ORDER BY total_bytes DESC LIMIT 10;
> name | id | parent | parent_id | path
> -------------------------+-----+------------------+-----------+--------------
> CacheMemoryContext | 27 | TopMemoryContext | 0 | {0}
> Timezones | 124 | TopMemoryContext | 0 | {0}
> TopMemoryContext | 0 | | |
> MessageContext | 8 | TopMemoryContext | 0 | {0}
> WAL record construction | 118 | TopMemoryContext | 0 | {0}
> ExecutorState | 18 | PortalContext | 17 | {0,16,17}
> TupleSort main | 19 | ExecutorState | 18 | {0,16,17,18}
> TransactionAbortContext | 14 | TopMemoryContext | 0 | {0}
> smgr relation table | 10 | TopMemoryContext | 0 | {0}
> GUC hash table | 123 | GUCMemoryContext | 122 | {0,122}
> (10 rows)

Would we still need the parent_id column?

> +
> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>context_id</structfield> <type>int4</type>
> + </para>
> + <para>
> + Current context id
> + </para></entry>
> + </row>

I think the docs here need to warn that the id is ephemeral and will likely
differ in the next invocation.

> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>parent_id</structfield> <type>int4</type>
> + </para>
> + <para>
> + Parent context id
> + </para></entry>
> + </row>
> +
> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>path</structfield> <type>int4</type>
> + </para>
> + <para>
> + Path to reach the current context from TopMemoryContext
> + </para></entry>
> + </row>

Perhaps we should include some hint here how it could be used?

> </tbody>
> </tgroup>
> </table>
> diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
> index 92ca5b2f72..81cb35dd47 100644
> --- a/src/backend/utils/adt/mcxtfuncs.c
> +++ b/src/backend/utils/adt/mcxtfuncs.c
> @@ -20,6 +20,7 @@
> #include "mb/pg_wchar.h"
> #include "storage/proc.h"
> #include "storage/procarray.h"
> +#include "utils/array.h"
> #include "utils/builtins.h"
>
> /* ----------
> @@ -28,6 +29,8 @@
> */
> #define MEMORY_CONTEXT_IDENT_DISPLAY_SIZE 1024
>
> +static Datum convert_path_to_datum(List *path);
> +
> /*
> * PutMemoryContextsStatsTupleStore
> * One recursion level for pg_get_backend_memory_contexts.
> @@ -35,9 +38,10 @@
> static void
> PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
> TupleDesc tupdesc, MemoryContext context,
> - const char *parent, int level)
> + const char *parent, int level, int *context_id,
> + int parent_id, List *path)
> {
> -#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9
> +#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 12
>
> Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
> bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
> @@ -45,6 +49,7 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
> MemoryContext child;
> const char *name;
> const char *ident;
> + int current_context_id = (*context_id)++;
>
> Assert(MemoryContextIsValid(context));
>
> @@ -103,13 +108,29 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
> values[6] = Int64GetDatum(stat.freespace);
> values[7] = Int64GetDatum(stat.freechunks);
> values[8] = Int64GetDatum(stat.totalspace - stat.freespace);
> + values[9] = Int32GetDatum(current_context_id);
> +
> + if(parent_id < 0)
> + /* TopMemoryContext has no parent context */
> + nulls[10] = true;
> + else
> + values[10] = Int32GetDatum(parent_id);
> +
> + if (path == NIL)
> + nulls[11] = true;
> + else
> + values[11] = convert_path_to_datum(path);
> +
> tuplestore_putvalues(tupstore, tupdesc, values, nulls);
>
> + path = lappend_int(path, current_context_id);
> for (child = context->firstchild; child != NULL; child = child->nextchild)
> {
> - PutMemoryContextsStatsTupleStore(tupstore, tupdesc,
> - child, name, level + 1);
> + PutMemoryContextsStatsTupleStore(tupstore, tupdesc, child, name,
> + level+1, context_id,
> + current_context_id, path);
> }
> + path = list_delete_last(path);
> }
>
> /*
> @@ -120,10 +141,15 @@ Datum
> pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
> {
> ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> + int context_id = 0;
> + List *path = NIL;
> +
> + elog(LOG, "pg_get_backend_memory_contexts called");
>
> InitMaterializedSRF(fcinfo, 0);
> PutMemoryContextsStatsTupleStore(rsinfo->setResult, rsinfo->setDesc,
> - TopMemoryContext, NULL, 0);
> + TopMemoryContext, NULL, 0, &context_id,
> + -1, path);
>
> return (Datum) 0;
> }
> @@ -193,3 +219,26 @@ pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
>
> PG_RETURN_BOOL(true);
> }
> +
> +/*
> + * Convert a list of context ids to a int[] Datum
> + */
> +static Datum
> +convert_path_to_datum(List *path)
> +{
> + Datum *datum_array;
> + int length;
> + ArrayType *result_array;
> + ListCell *lc;
> +
> + length = list_length(path);
> + datum_array = (Datum *) palloc(length * sizeof(Datum));
> + length = 0;
> + foreach(lc, path)
> + {
> + datum_array[length++] = Int32GetDatum((int) lfirst_int(lc));

The "(int)" in front of lfirst_int() seems redundant?

I think it'd be good to have some minimal test for this. E.g. checking that
there's multiple contexts below cache memory context or such.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-10-12 16:24:19 Re: Performance degradation on concurrent COPY into a single relation in PG16.
Previous Message Peter Geoghegan 2023-10-12 16:00:29 Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound