Re: Adding column "mem_usage" to view pg_prepared_statements

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Daniel Migowski <dmigowski(at)ikoffice(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding column "mem_usage" to view pg_prepared_statements
Date: 2019-08-06 07:48:31
Message-ID: 6cccc75b-2c3d-7016-710a-5119f4092d9b@postgrespro.ru
Lists: pgsql-hackers

On 05.08.2019 22:35, Daniel Migowski wrote:
>> I think that including in pg_prepared_statements information about
>> the memory used by a statement is very useful.
>> The CachedPlanMemoryUsage function may be useful not only for this view;
>> for example, it is also needed in my autoprepare patch.
> I would love to use your work if it's done, and would also love to
> work together here. I am quite a novice in C though, so I might take my
> time to get things right.

Right now I reused your implementation of the CachedPlanMemoryUsage function :)
Before, I took into account only the memory used by plan->context, but not
plan->query_context and plan->gplan->context (although the query_context for
the raw parse tree seems to be much smaller).
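The accounting can be sketched with simplified stand-ins for the real structures. The field names below mirror CachedPlanSource/CachedPlan, but everything else is a mock for illustration, not the actual backend code:

```c
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL's MemoryContext, CachedPlanSource
 * and CachedPlan; only the fields relevant to size accounting are kept. */
typedef struct MockContext
{
    size_t total_bytes;         /* what the mcxt stats callback would report */
} MockContext;

typedef struct MockCachedPlan
{
    MockContext *context;       /* the generic plan lives here */
} MockCachedPlan;

typedef struct MockCachedPlanSource
{
    MockContext *context;       /* analyzed-and-rewritten query list */
    MockContext *query_context; /* raw parse tree */
    MockCachedPlan *gplan;      /* generic plan, may be NULL */
} MockCachedPlanSource;

/* Sum all contexts hanging off a cached plan source, as discussed above:
 * plan->context, plan->query_context and plan->gplan->context. */
size_t
cached_plan_memory_usage(const MockCachedPlanSource *src)
{
    size_t total = src->context->total_bytes
                 + src->query_context->total_bytes;

    if (src->gplan != NULL)
        total += src->gplan->context->total_bytes;
    return total;
}
```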

>> I wonder if you would consider going further and not only reporting but
>> also controlling the memory used by prepared statements?
>> For example, implementing some LRU replacement discipline on top of the
>> prepared statements cache which can
>> evict rarely used prepared statements to avoid memory overflow.
>
> THIS! Having some kind of safety net here would finally make sure that
> my precious processes will not grow endlessly until all mem is eaten
> up, even with prep statement count limits.
>
> While working on stuff I noticed there are three things stored in a
> CachedPlanSource: the raw query tree (a relatively small thing), the
> query list (the analyzed-and-rewritten query tree), which takes up the most
> memory (at least here; it may be different with your use cases), and (often
> after the 6th call) the CachedPlan, which is more optimized than the
> query list and often needs less memory (half of the query list here).
>
> The query list seems to take the most time to create here, because I
> hit the GEQO engine, but it could be recreated easily (up to
> 500ms for some queries). Creating the CachedPlan afterwards takes 60ms
> in some use cases. If we could just invalidate them from time to time,
> that would be great.
>
> Also, invalidating just the queries or the CachedPlan would not
> invalidate the whole prepared statement, which would break clients'
> expectations; it would just make them slower, adding much to the
> stability of the system. I would pay that price, because I just don't
> use manually named prepared statements anyway and just autogenerate
> them as performance sugar without thinking about what really needs to
> be prepared. There is an option in the JDBC driver to use
> prepared statements automatically after you have used them a few times.
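The LRU replacement discipline mentioned above could look roughly like this. This is a hypothetical sketch of a prepared statement cache limited by total memory rather than by entry count; the names (StmtCache, stmt_cache_add) and the limit are invented for illustration, not backend code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical LRU cache of prepared statements, bounded by total memory.
 * The head of the list is the most recently used entry. */
typedef struct StmtEntry
{
    char name[64];
    size_t mem_usage;            /* bytes attributed to this statement */
    struct StmtEntry *prev, *next;
} StmtEntry;

typedef struct StmtCache
{
    StmtEntry *head, *tail;      /* head = most recently used */
    size_t total_mem;
    size_t mem_limit;
    int evictions;
} StmtCache;

static void
unlink_entry(StmtCache *cache, StmtEntry *e)
{
    if (e->prev) e->prev->next = e->next; else cache->head = e->next;
    if (e->next) e->next->prev = e->prev; else cache->tail = e->prev;
}

/* Insert a statement, evicting least-recently-used entries until the
 * running total fits under mem_limit. */
void
stmt_cache_add(StmtCache *cache, const char *name, size_t mem_usage)
{
    while (cache->tail && cache->total_mem + mem_usage > cache->mem_limit)
    {
        StmtEntry *victim = cache->tail;

        unlink_entry(cache, victim);
        cache->total_mem -= victim->mem_usage;
        cache->evictions++;
        free(victim);
    }

    StmtEntry *e = calloc(1, sizeof(StmtEntry));
    strncpy(e->name, name, sizeof(e->name) - 1);
    e->mem_usage = mem_usage;
    e->next = cache->head;
    if (cache->head) cache->head->prev = e; else cache->tail = e;
    cache->head = e;
    cache->total_mem += mem_usage;
}
```

In a real implementation, eviction could either drop the whole statement or, as suggested above, just the query list or CachedPlan, keeping the cheap raw tree so the statement can be rebuilt on next use.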

I have noticed that cached plans for implicitly prepared statements in
stored procedures are not shown in the pg_prepared_statements view.
It may not be a problem in your case (if you are accessing Postgres
through JDBC and not using stored procedures),
but it can cause memory overflow in applications actively using stored
procedures, because unlike explicitly created prepared statements, it is
very difficult to estimate and control statements implicitly prepared by
plpgsql.

I am not sure what the best solution would be in this case. Adding yet
another view for implicitly prepared statements? Or including them in
the pg_prepared_statements view?
>
>> We have such a patch for PgPro-EE, but it limits only the number of
>> prepared statements, without taking into account the amount of memory
>> used by them. I think that a memory-based limit would be more accurate
>> (although it adds more overhead).
>
> Limiting them by number is already done automatically here and would
> really not be of much value, but having a mem limit would be great. We
> could have a combined memory limit for your autoprepared statements as
> well as the manually prepared ones, so clients can know for sure the
> server processes won't eat up more than e.g. 800MB for prepared
> statements. And I would also like to have this value spread across all
> client processes, e.g. specifying max_prepared_statement_total_mem=5G
> for the server, and maybe max_prepared_statement_mem=1G for client
> processes. Of course we would have to implement cross-client-process
> invalidation here, and I don't know if communicating client processes
> are even intended.
>
> Anyway, a memory limit won't really add that much more overhead. At
> least not more than having no prepared statements at all because of
> the fear of server OOMs, or having just a small count of those
> statements. I was even thinking about a prepared statement reaper that
> checks pg_prepared_statements every few minutes to clean things up
> manually, but having this in the server would be of great value to me.

Right now a memory context has no field containing the amount of currently
used memory.
This is why the context->methods->stats function implementation has to
traverse all blocks to calculate the size of memory used by the context.
It may not be so fast for large contexts. But I do not expect that the
contexts of prepared statements will be very large, although
I have dealt with customers who issued queries with a query text length
larger than a few megabytes. I am afraid to estimate the size of the plan
for such queries.
This is the reason for my concern that calculating the memory context size
may have a negative effect on performance. But it has to be done only once,
when the statement is prepared, so maybe it is not a problem at all.
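The block traversal can be illustrated with a simplified mock of a context that, like PostgreSQL's aset.c allocator, owns a chain of blocks plus child contexts; the stats callback has to visit each block, so the cost is linear in the number of blocks. This is an illustrative stand-in, not the real allocator:

```c
#include <stddef.h>

/* Mock of a memory context: a chain of allocated blocks plus a tree of
 * child contexts, mirroring the shape (but not the code) of mcxt.c. */
typedef struct Block
{
    size_t size;
    struct Block *next;
} Block;

typedef struct Context
{
    Block *blocks;               /* chain of blocks owned by this context */
    struct Context *firstchild;  /* child contexts */
    struct Context *nextsibling;
} Context;

/* Walk all blocks of the context and, recursively, of its children.
 * Since no running total is maintained, every size query pays this
 * linear traversal; that is the overhead discussed above. */
size_t
context_total_space(const Context *ctx)
{
    size_t total = 0;

    for (const Block *b = ctx->blocks; b != NULL; b = b->next)
        total += b->size;
    for (const Context *child = ctx->firstchild; child != NULL;
         child = child->nextsibling)
        total += context_total_space(child);
    return total;
}
```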

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
