Re: Backend memory dump analysis

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Backend memory dump analysis
Date: 2018-03-24 17:10:19
Message-ID: 28067.1521911419@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I'll put together a draft patch.

Here's a draft patch for this. Some notes:

* I'm generally pretty happy with the way this turned out. For instance,
you can now tell the difference between index info, partition descriptor,
and RLS policy sub-contexts of CacheMemoryContext; previously they were
all just labeled with the name of their relation, which is useful but not
really enough anymore. I've attached sample MemoryContextStats output
captured near the end of the plpgsql.sql regression test.

* I reverted the addition of the "flags" parameter to the context creation
functions, since it no longer had any use. We could have left it there
for future expansion, but doing so seems like an unnecessary deviation
from the v10 APIs. We can cross that bridge when and if we come to it.

* I'd have liked to get rid of the AllocSetContextCreate wrapper macro
entirely, reverting that to the way it was in v10 as well, but I don't see
any way to do so without giving up the __builtin_constant_p(name) check,
which seems like it'd be a bad idea.

* In some places there's a pstrdup more than there was before (hidden
inside MemoryContextCopySetIdentifier), but I don't think this is really
much more expensive than the previous behavior with MEMCONTEXT_COPY_NAME.
In particular, the fact that having a non-constant identifier no longer
disqualifies a context from participating in aset.c's freelist scheme
probably buys back the overhead. I haven't done any performance testing,
though.

* It seemed worth expending a pstrdup to copy the source string for a
CachedPlan into its context so it could be labeled properly. We can't
just point to the source string in the originating CachedPlanSource
because that might have a different/shorter lifespan. I think in the
long run this would come out to be "free" anyway because someday we're
going to insist on having the source string available at execution for
error-reporting reasons.

* On the other hand, I didn't copy the source string into SPI Plan
contexts. Looking at the sample output, I started to get a bit annoyed
that we have those as separate contexts at all --- at least in the
standard case of one subsidiary CachedPlanSource, seems like we could
combine the contexts. That's fit material for a separate patch, though.

* While I didn't do anything about it here, I think it'd likely be a
good idea for MemoryContextStats printout to truncate the context ID
strings at 100 characters or so. Otherwise, in situations where we
have long queries with cached plans, the output would get unreasonably
bulky. Any thoughts about the exact rule?

* With the preceding idea in mind, I decided to fix things so that the
formatting of MemoryContextStats output is centralized in one place
instead of being repeated in each context module. So there's now a
callback-function API there. This has the additional benefit that people
could write extensions that collect stats output and do $whatever with it,
without relying on anything more than what memnodes.h exposes.

Comments, objections?

regards, tom lane

Attachment Content-Type Size
add-context-identifiers-1.patch text/x-diff 40.9 KB
sample-stats-output.txt text/plain 223.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-03-24 17:19:33 Re: [HACKERS] MERGE SQL Statement for PG11
Previous Message Michael Banck 2018-03-24 14:32:15 Re: [PATCH] Verify Checksums during Basebackups