Re: Cached plans and statement generalization

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cached plans and statement generalization
Date: 2017-04-28 10:01:31
Message-ID: 304dfa09-6135-cbcf-d6e5-1ea6c36e6eec@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 26.04.2017 13:46, Pavel Stehule wrote:
>
> I attach new patch which allows to limit the number of
> autoprepared statements (autoprepare_limit GUC variable).
> Also I did more measurements, now with several concurrent
> connections and read-only statements.
> Results of pgbench with 10 connections, scale 10 and read-only
> statements are below:
>
> Protocol
> TPS
> extended
> 87k
> prepared
> 209k
> simple+autoprepare
> 206k
>
>
> As you can see, autoprepare provides more than 2 times speed
> improvement.
>
> Also I tried to measure overhead of parsing (to be able to
> substitute all literals, not only string literals).
> I just added extra call of pg_parse_query. Speed is reduced to 181k.
> So overhead is noticeable, but still making such optimization useful.
> This is why I want to ask question: is it better to implement
> slower but safer and more universal solution?
>
>
> Unsafe solution has not any sense, and it is dangerous (80% of
> database users has not necessary knowledge). If somebody needs the max
> possible performance, then he use explicit prepared statements.
>

I attached new patch to this mail. I completely reimplement my original
approach and now use parse tree transformation.
New pgbench (-S -c 10) results are the following:

Protocol
TPS
extended
87k
prepared
209k
simple+autoprepare
185k

So there is some slowdown comparing with my original implementation and
explicitly prepared statements, but still it provide more than two times
speed-up comparing with unprepared queries. And it doesn't require to
change existed applications.
As far as most of real production application are working with DBMS
through some connection pool (pgbouncer,...), I think that such
optimization will be useful.
Isn't it interesting if If we can increase system throughput almost two
times by just setting one parameter in configuration file?

I also tried to enable autoprepare by default and run regression tests.
7 tests are not passed because of the following reasons:
1. Slightly different error reporting (for example error location is not
always identically specified).
2. Difference in query behavior caused by changed local settings
(Andres gives an example with search_path, and date test is failed
because of changing datestyle).
3. Problems with indirect dependencies (when table is altered only
cached plans directly depending on this relation and invalidated, but
not plans with indirect dependencies).
4. Not performing domain checks for null values.

I do not think that this issues can cause problems for real application.

Also it is possible to limit number of autoprepared statements using
autoprepare_limit parameter, avoid possible backend memory overflow in
case of larger number of unique queries sent by application. LRU
discipline is used to drop least recently used plans.

Any comments and suggestions for future improvement of this patch are
welcome.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
autoprepare.patch text/x-patch 42.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 高增琦 2017-04-28 10:12:15 Re: Dropping a partitioned table takes too long
Previous Message Teodor Sigaev 2017-04-28 09:48:29 Re: convert EXSITS to inner join gotcha and bug