Re: Cached plans and statement generalization

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cached plans and statement generalization
Date: 2017-05-12 07:50:41
Message-ID: 857b4487-5f44-07eb-0aa7-aa152ee8ff98@postgrespro.ru
Lists: pgsql-hackers

On 12.05.2017 03:58, Bruce Momjian wrote:
> On Thu, May 11, 2017 at 10:41:45PM +0300, Konstantin Knizhnik wrote:
>> This is why I have provided second implementation which replace
>> literals with parameters after raw parsing. Certainly it is slower
>> than first approach. But still provide significant advantage in
>> performance: more than two times at pgbench. Then I tried to run
>> regression tests and find several situations where type analysis is
>> not correctly performed in case of replacing literals with parameters.
> So the issue is that per-command output from the parser, SelectStmt,
> only has strings for identifiers, e.g. table and column names, so you
> can't be sure it is the same as the cached entry you matched. I suppose
> if you cleared the cache every time someone created an object or changed
> search_path, it might work.
>
Definitely, changing the session context (search_path, date/time
format, ...) may cause incorrect behavior of cached statements.
Note that you can get the same problem with explicitly prepared
statements (although in that case you better understand what is going
on, and it is your choice whether or not to use a prepared statement).
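
As a hedged illustration (exact behavior depends on the PostgreSQL
version, and the statement name here is invented), a date literal is
folded into a constant at prepare time, so a later DateStyle change is
not reflected in the cached statement:

    SET datestyle = 'ISO, MDY';
    PREPARE q AS SELECT '02/01/2017'::date AS d;
    EXECUTE q;                    -- 2017-02-01 (February 1st under MDY)

    SET datestyle = 'ISO, DMY';
    EXECUTE q;                    -- still 2017-02-01: the constant was
                                  -- folded under the old DateStyle
    SELECT '02/01/2017'::date;    -- 2017-01-02 (January 2nd under DMY)

As far as I know, cached plans are revalidated on DDL and search_path
changes, but not on every GUC change, which is where such surprises
come from.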

The fact that 7 regression tests fail means that autoprepare can
really change the behavior of an existing program. This is why my
suggestion is to switch off this feature by default.
But in 99.9% of real cases (my estimation plucked out of thin air :)
there will be no such problems with autoprepare, and it can
significantly improve the performance of OLTP applications which are
not able to use prepared statements (because they work through
pgbouncer, or for other reasons).
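
To make this concrete, here is a hypothetical sketch of what the
generalization does (the statement name is invented; the table comes
from pgbench, which I used for testing):

    -- The application sends an ordinary query containing a literal:
    SELECT abalance FROM pgbench_accounts WHERE aid = 42;

    -- After raw parsing, autoprepare replaces the literal with a
    -- parameter, so all queries differing only in that literal can
    -- share one cached plan, roughly as if the client had issued:
    PREPARE get_balance(int) AS
        SELECT abalance FROM pgbench_accounts WHERE aid = $1;
    EXECUTE get_balance(42);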

Can autoprepare slow down the system?
Yes, it can. This happens if the application performs a large number
of unique queries and the autoprepare cache size is not limited.
In this case the large (and infinitely growing) number of stored plans
can consume a lot of memory and, what is even worse, slow down cache
lookup. This is why, by default, I limit the number of cached
statements to 100 (the autoprepare_limit parameter).
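
Assuming it can be set like any ordinary GUC (the syntax below is my
assumption; only the parameter name and its default of 100 come from
the patch), the limit could be tuned per session:

    SET autoprepare_limit = 500;  -- allow more cached generalized statements
    SHOW autoprepare_limit;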

I am almost sure that there will be some other issues with autoprepare
which I have not encountered yet (because I have mostly tested it on
pgbench and the Postgres regression tests).
But I am also sure that the benefit of doubling system performance is
good motivation to continue work in this direction.

My main question is whether to continue improving the current approach
with a local (per-backend) cache of prepared statements, or to create
a shared cache (as in Oracle). A shared cache is much more difficult
to implement (the same problem with session context, plus different
catalog snapshots, cache invalidation, ...), but it also provides more
opportunities for query optimization and tuning.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
