Re: Adding column "mem_usage" to view pg_prepared_statements

From: Andres Freund <andres(at)anarazel(dot)de>
To: Daniel Migowski <dmigowski(at)ikoffice(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding column "mem_usage" to view pg_prepared_statements
Date: 2019-08-05 21:03:44
Message-ID: 20190805210344.n2opqjdqy6xf4dte@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-08-05 22:46:47 +0200, Daniel Migowski wrote:
> > Arguably the proposed owning_object field would be a bit redundant with
> > the already existing ident/MemoryContextSetIdentifier field, which
> > e.g. already associates the query string with the contexts used for a
> > prepared statement. But I'm not convinced that's going to be enough
> > context in a lot of cases, because e.g. for prepared statements it could
> > be interesting to have access to both the prepared statement name, and
> > the statement.
> The identifier seems to be more like a category at the moment, because it
> does not seem to hold any relevant information about the object in question.
> So a more specific name would be nice.

I think you might be thinking of the context's name, not ident? E.g. for
prepared statements the context name is:

    source_context = AllocSetContextCreate(CurrentMemoryContext,
                                           "CachedPlanSource",
                                           ALLOCSET_START_SMALL_SIZES);

which is obviously the same for every statement. But then there's

    MemoryContextSetIdentifier(source_context, plansource->query_string);

which obviously differs.
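
To make the name/ident distinction concrete, here is a minimal standalone sketch. It is not PostgreSQL code: ToyContext and the toy_* functions are hypothetical stand-ins that only mimic the roles of AllocSetContextCreate() and MemoryContextSetIdentifier() shown above, assuming the name acts as a fixed category label and the ident as a per-object label.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a memory context; NOT PostgreSQL's real struct. */
typedef struct ToyContext
{
    const char *name;   /* fixed category label, set at creation */
    const char *ident;  /* optional per-object label, set afterwards */
} ToyContext;

/* Hypothetical analogue of context creation: only the name is set. */
ToyContext
toy_context_create(const char *name)
{
    ToyContext ctx = {name, NULL};

    return ctx;
}

/* Hypothetical analogue of setting the identifier on an existing context. */
void
toy_context_set_ident(ToyContext *ctx, const char *ident)
{
    assert(ctx != NULL);
    ctx->ident = ident;
}

/* Write "name: ident" (or just "name" if no ident is set) into buf. */
char *
toy_context_describe(const ToyContext *ctx, char *buf, size_t buflen)
{
    if (ctx->ident != NULL)
        snprintf(buf, buflen, "%s: %s", ctx->name, ctx->ident);
    else
        snprintf(buf, buflen, "%s", ctx->name);
    return buf;
}
```

Two toy contexts created with the name "CachedPlanSource" and given the idents "SELECT 1" and "SELECT 2" share a name but describe themselves differently, which is the point being made about prepared statements.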

> > The reason I like something like this is that we wouldn't add new
> > columns to a number of views, and lack views to associate such
> > information to for some objects. And it'd be disproportional to add all
> > the information to numerous places anyway.
> I understand your argumentation, but things like Cursors and Portals are
> rather short living while prepared statements seem to be the place where
> memory really builds up.

That's not necessarily true, especially given WITH HOLD cursors. Nor
does one only run out of memory in the context of long-lived objects.

> > > While being interesting I still believe monitoring the mem usage of
> > > prepared statements is a bit more important than that of other objects
> > > because of how they change memory consumption of the server without
> > > using any DDL or configuration options and I am not aware of other
> > > objects with the same properties, or are there some? And for the other
> > > volatile objects like tables and indexes and their contents PostgreSQL
> > > already has its information functions.

> > Plenty of other objects have that property. E.g. cursors. And for the
> > catalog/relation/... caches it's even more pernicious - the client might
> > have closed all its "handles", but we still use memory (and it's
> > absolutely crucial for performance).
>
> Maybe we can do both? Add a single column to pg_prepared_statements, and add
> another table for the output of MemoryContextStatsDetail? This has the
> advantage that the single real memory indicator useful for end users (to the
> question: How much mem takes my sh*t up?) is in pg_prepared_statements and
> some more intrinsic information in a detail view.

I don't see why we'd want to do both. It just makes pg_prepared_statements
considerably more expensive, and that view is queried by some applications /
clients in an automated manner.

> Thinking about the latter I am against such a table, at least in the form
> where it gives information like context_total_freechunks, because it would
> just be useful for us developers.

Developers are also an audience for us. I mean we certainly can use this
information during development. But even for bug reports such information
would be useful.

> Why should any end user care for how many
> chunks are still open in a MemoryContext, except when he is working on
> C-style extensions. Could just be a source of confusion for them.

Meh. As long as the crucial stuff is first, that's imo enough.

> Let's think about the goal this should have: The end user should be able to
> monitor the memory consumption of things he's in control of or could affect
> the system performance. Should such a table automatically aggregate some
> information? I think so. I would not add more than two memory columns to the
> view, just mem_used and mem_reserved. And even mem_used is questionable,
> because in his eyes only the memory he cannot use for other stuff because of
> object x is important for him (that was the reason I just added one column).
> He would even ask: WHY is there 50% more memory reserved than used, and how
> I can optimize it? (Would lead to more curious PostgreSQL developers maybe,
> so that's maybe a plus).

It's important because it influences how memory usage will grow.
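
As a rough illustration of why the gap between used and reserved memory matters, here is a simplified standalone model. It is not the real allocator: ToyAllocSet and its functions are hypothetical, and the only behaviour modelled is the assumption that each new block is twice the size of the previous one up to a cap. Leftover space in an abandoned block is simply counted as reserved but unused.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of a block-doubling allocator; NOT PostgreSQL's aset.c. */
typedef struct ToyAllocSet
{
    size_t next_block_size; /* size of the next block to reserve */
    size_t max_block_size;  /* block size stops doubling at this cap */
    size_t reserved;        /* total bytes reserved in blocks so far */
    size_t used;            /* total bytes handed out to callers */
    size_t free_in_block;   /* unused bytes left in the current block */
} ToyAllocSet;

ToyAllocSet
toy_allocset_init(size_t init_block_size, size_t max_block_size)
{
    ToyAllocSet set = {init_block_size, max_block_size, 0, 0, 0};

    assert(init_block_size > 0 && init_block_size <= max_block_size);
    return set;
}

/* Account for one request of `size` bytes (must fit in a maximum-size block). */
void
toy_alloc(ToyAllocSet *set, size_t size)
{
    assert(set != NULL && size <= set->max_block_size);

    /* Reserve new, larger blocks until the current one can hold the request. */
    while (set->free_in_block < size)
    {
        set->reserved += set->next_block_size;
        set->free_in_block = set->next_block_size;
        if (set->next_block_size * 2 <= set->max_block_size)
            set->next_block_size *= 2;
        else
            set->next_block_size = set->max_block_size;
    }

    set->free_in_block -= size;
    set->used += size;
}
```

In this model, starting from 1024-byte blocks with an 8192-byte cap, two 600-byte requests leave 1200 bytes used but 3072 bytes reserved, and the next block to be reserved would be 4096 bytes. Knowing only the used figure would hide how large the next growth step is.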

> On the other hand: The Generic Plan had been created for the first
> invocation of the prepared statement, why not store it immediately. It is a
> named statement for a reason that it is intended to be reused, even when it
> is just twice, and since memory seems not to be seen as a scarce resource in
> this context why not store that immediately. Would drop the need for a
> hierarchy here also.

Well, we'll maybe never use it, so ...

Greetings,

Andres Freund
