Re: Add the ability to limit the amount of memory that can be allocated to backends.

From: Reid Thompson <reid(dot)thompson(at)crunchydata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: reid(dot)thompson(at)crunchydata(dot)com, vignesh C <vignesh21(at)gmail(dot)com>, Arne Roland <A(dot)Roland(at)index(dot)de>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Subject: Re: Add the ability to limit the amount of memory that can be allocated to backends.
Date: 2023-01-13 14:15:10
Message-ID: c5f84d8f22542d92ef640b2619e62eb5de56cb5b.camel@crunchydata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2023-01-09 at 18:31 -0800, Andres Freund wrote:
> Hi,
>
> On 2023-01-05 13:44:20 -0500, Reid Thompson wrote:
> > This new field displays the current bytes of memory allocated to the
> > backend process. It is updated as memory for the process is
> > malloc'd/free'd. Memory allocated to items on the freelist is included in
> > the displayed value.
>
> It doesn't actually malloc/free. It tracks palloc/pfree.

I will update the message

>
> > Dynamic shared memory allocations are included only in the value displayed
> > for the backend that created them, they are not included in the value for
> > backends that are attached to them to avoid double counting.
>
> As mentioned before, I don't think accounting DSM this way makes sense.

Understood, previously you noted 'There are a few uses of DSMs that track
shared resources, with the biggest likely being the stats for relations
etc'. I'd like to come up with a solution to address this; identifying the
long term allocations to shared state and accounting for them such that they
don't get 'lost' when the allocating backend exits. Any guidance or
direction would be appreciated.

> > --- a/src/backend/postmaster/autovacuum.c
> > +++ b/src/backend/postmaster/autovacuum.c
> > @@ -407,6 +407,9 @@ StartAutoVacLauncher(void)
> >
> > #ifndef EXEC_BACKEND
> > case 0:
> > + /* Zero allocated bytes to avoid double counting parent allocation */
> > + pgstat_zero_my_allocated_bytes();
> > +
> > /* in postmaster child ... */
> > InitPostmasterChild();
>
>
>
> > @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void)
> >
> > #ifndef EXEC_BACKEND
> > case 0:
> > + /* Zero allocated bytes to avoid double counting parent allocation */
> > + pgstat_zero_my_allocated_bytes();
> > +
> > /* in postmaster child ... */
> > InitPostmasterChild();
> >
> > diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
> > index eac3450774..24278e5c18 100644
> > --- a/src/backend/postmaster/postmaster.c
> > +++ b/src/backend/postmaster/postmaster.c
> > @@ -4102,6 +4102,9 @@ BackendStartup(Port *port)
> > {
> > free(bn);
> >
> > + /* Zero allocated bytes to avoid double counting parent allocation */
> > + pgstat_zero_my_allocated_bytes();
> > +
> > /* Detangle from postmaster */
> > InitPostmasterChild();
>
>
> It doesn't at all seem right to call pgstat_zero_my_allocated_bytes() here,
> before even InitPostmasterChild() is called. Nor does it seem right to add the
> call to so many places.
>
> Note that this is before we even delete postmaster's memory, see e.g.:
> /*
> * If the PostmasterContext is still around, recycle the space; we don't
> * need it anymore after InitPostgres completes. Note this does not trash
> * *MyProcPort, because ConnCreate() allocated that space with malloc()
> * ... else we'd need to copy the Port data first. Also, subsidiary data
> * such as the username isn't lost either; see ProcessStartupPacket().
> */
> if (PostmasterContext)
> {
> MemoryContextDelete(PostmasterContext);
> PostmasterContext = NULL;
> }
>
> calling pgstat_zero_my_allocated_bytes() before we do this will lead to
> undercounting memory usage, afaict.
>

OK - I'll trace back through these and see if I can better locate and reduce the
number of invocations.

> > +/* Enum helper for reporting memory allocated bytes */
> > +enum allocation_direction
> > +{
> > + PG_ALLOC_DECREASE = -1,
> > + PG_ALLOC_IGNORE,
> > + PG_ALLOC_INCREASE,
> > +};
>
> What's the point of this?
>
>
> > +/* ----------
> > + * pgstat_report_allocated_bytes() -
> > + *
> > + * Called to report change in memory allocated for this backend.
> > + *
> > + * my_allocated_bytes initially points to local memory, making it safe to call
> > + * this before pgstats has been initialized. allocation_direction is a
> > + * positive/negative multiplier enum defined above.
> > + * ----------
> > + */
> > +static inline void
> > +pgstat_report_allocated_bytes(int64 allocated_bytes, int allocation_direction)
>
> I don't think this should take allocation_direction as a parameter, I'd make
> it two different functions.

Originally it was two functions, a suggestion was made in the thread to
maybe consolidate them to a single function with a direction indicator,
hence the above. I'm fine with converting it back to separate functions.

>
> > + if (allocation_direction == PG_ALLOC_DECREASE &&
> > + pg_sub_u64_overflow(*my_allocated_bytes, allocated_bytes, &temp))
> > + {
> > + *my_allocated_bytes = 0;
> > + ereport(LOG,
> > + errmsg("Backend %d deallocated %lld bytes, exceeding the %llu bytes it is currently reporting allocated. Setting reported to 0.",
> > + MyProcPid, (long long) allocated_bytes,
> > + (unsigned long long) *my_allocated_bytes));
>
> We certainly shouldn't have an ereport in here. This stuff really needs to be
> cheap.

I will remove the ereport.

>
> > + *my_allocated_bytes += (allocated_bytes) * allocation_direction;
>
> Superfluous parens?

I will remove these.

>
>
> > +/* ----------
> > + * pgstat_get_all_memory_allocated() -
> > + *
> > + * Return a uint64 representing the current shared memory allocated to all
> > + * backends. This looks directly at the BackendStatusArray, and so will
> > + * provide current information regardless of the age of our transaction's
> > + * snapshot of the status array.
> > + * In the future we will likely utilize additional values - perhaps limit
> > + * backend allocation by user/role, etc.
> > + * ----------
>
> I think it's completely unfeasible to execute something as expensive as
> pgstat_get_all_backend_memory_allocated() on every allocation. Like,
> seriously, no.

Ok. Do we check every nth allocation/try to implement a scheme of checking
more often as we we get closer to the declared max_total_bkend_mem?

>
> And we absolutely definitely shouldn't just add CHECK_FOR_INTERRUPT() calls
> into the middle of allocator code.

I'm open to guidance/suggestions/pointers to remedying these.

> Greetings,
>
> Andres Freund
>

Thanks,
Reid

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhang Mingli 2023-01-13 14:18:09 Re: [feature]COPY FROM enable FORCE_NULL/FORCE_NOT_NULL on all columns
Previous Message Justin Pryzby 2023-01-13 14:02:21 Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl