Re: Add tracking of backend memory allocated to pg_stat_activity

From: Andres Freund <andres(at)anarazel(dot)de>
To: Reid Thompson <reid(dot)thompson(at)crunchydata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Subject: Re: Add tracking of backend memory allocated to pg_stat_activity
Date: 2022-11-05 02:41:46
Message-ID: 20221105024146.xxlbtsxh2niyz2fu@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-11-04 08:56:13 -0400, Reid Thompson wrote:
> From a8de5d29c0c6f10962181926a49ad4fec1e52bd1 Mon Sep 17 00:00:00 2001
> From: Reid Thompson <jreidthompson(at)nc(dot)rr(dot)com>
> Date: Thu, 11 Aug 2022 12:01:25 -0400
> Subject: [PATCH] Add tracking of backend memory allocated to pg_stat_activity
>
> This new field displays the current bytes of memory allocated to the
> backend process. It is updated as memory for the process is
> malloc'd/free'd. Memory allocated to items on the freelist is included in
> the displayed value. Dynamic shared memory allocations are included
> only in the value displayed for the backend that created them, they are
> not included in the value for backends that are attached to them to
> avoid double counting. On occasion, orphaned memory segments may be
> cleaned up on postmaster startup. This may result in decreasing the sum
> without a prior increment. We limit the floor of backend_mem_allocated
> to zero. Updated pg_stat_activity documentation for the new column.

I'm not convinced that counting DSM values this way is quite right. There are
a few uses of DSMs that track shared resources, with the biggest likely being
the stats for relations etc. I suspect that tracking that via backend memory
usage will be quite confusing, because fairly random backends that had to grow
the shared state end up being blamed with the memory usage in perpituity - and
then suddenly that memory usage will vanish when that backend exits, despite
the memory continuing to exist.

> @@ -734,6 +747,7 @@ AllocSetAlloc(MemoryContext context, Size size)
> return NULL;
>
> context->mem_allocated += blksize;
> + pgstat_report_backend_allocated_bytes_increase(blksize);
>
> block->aset = set;
> block->freeptr = block->endptr = ((char *) block) + blksize;
> @@ -944,6 +958,7 @@ AllocSetAlloc(MemoryContext context, Size size)
> return NULL;
>
> context->mem_allocated += blksize;
> + pgstat_report_backend_allocated_bytes_increase(blksize);
>
> block->aset = set;
> block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
> @@ -1043,6 +1058,7 @@ AllocSetFree(void *pointer)
> block->next->prev = block->prev;
>
> set->header.mem_allocated -= block->endptr - ((char *) block);
> + pgstat_report_backend_allocated_bytes_decrease(block->endptr - ((char *) block));
>
> #ifdef CLOBBER_FREED_MEMORY
> wipe_mem(block, block->freeptr - ((char *) block));

I suspect this will be noticable cost-wise. Even though these paths aren't the
hottest memory allocation paths, by nature of going down into malloc, adding
an external function call that then does a bunch of branching etc. seems
likely to add up to some. See below for how I think we can deal with that...

> +
> +/* --------
> + * pgstat_report_backend_allocated_bytes_increase() -
> + *
> + * Called to report increase in memory allocated for this backend
> + * --------
> + */
> +void
> +pgstat_report_backend_allocated_bytes_increase(uint64 allocation)
> +{
> + volatile PgBackendStatus *beentry = MyBEEntry;
> +
> + if (!beentry || !pgstat_track_activities)
> + {
> + /*
> + * Account for memory before pgstats is initialized. This will be
> + * migrated to pgstats on initialization.
> + */
> + backend_allocated_bytes += allocation;
> +
> + return;
> + }
> +
> + /*
> + * Update my status entry, following the protocol of bumping
> + * st_changecount before and after. We use a volatile pointer here to
> + * ensure the compiler doesn't try to get cute.
> + */
> + PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
> + beentry->backend_allocated_bytes += allocation;
> + PGSTAT_END_WRITE_ACTIVITY(beentry);
> +}

This is quite a few branches, including write/read barriers.

It doesn't really make sense to use the PGSTAT_BEGIN_WRITE_ACTIVITY() pattern
here - you're just updating a single value, there's nothing to be gained by
it. The point of PGSTAT_BEGIN_*ACTIVITY() stuff is to allow readers to get a
consistent view of multiple values - but there aren't multiple values here!

To avoid the overhead of checking (!beentry || !pgstat_track_activities) and
the external function call, I think you'd be best off copying the trickery I
introduced for pgstat_report_wait_start(), in 225a22b19ed.

I.e. make pgstat_report_backend_allocated_bytes_increase() a static inline
function that unconditionally updates something like
*my_backend_allocated_memory. To deal with the case of (!beentry ||
!pgstat_track_activities), that variable initially points to some backend
local state and is set to the shared state in pgstat_bestart().

This additionally has the nice benefit that you can track memory usage from
before pgstat_bestart(), it'll be in the local variable.

> +void
> +pgstat_report_backend_allocated_bytes_decrease(uint64 deallocation)
> +{
> + volatile PgBackendStatus *beentry = MyBEEntry;
> +
> + /*
> + * Cases may occur where shared memory from a previous postmaster
> + * invocation still exist. These are cleaned up at startup by
> + * dsm_cleanup_using_control_segment. Limit decreasing memory allocated to
> + * zero in case no corresponding prior increase exists or decrease has
> + * already been accounted for.
> + */

I don't really follow - postmaster won't ever have a backend status array, so
how would they be tracked here?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-11-05 02:46:30 Re: pg_reload_conf() synchronously
Previous Message Andres Freund 2022-11-05 01:59:46 Re: CI and test improvements