Re: [HACKERS] Cached plans and statement generalization

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Cached plans and statement generalization
Date: 2019-07-31 19:48:16
Message-ID: 5538dbd6-fcbb-33a7-49d8-28c5cd2df435@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 31.07.2019 19:56, Heikki Linnakangas wrote:
> On 09/07/2019 23:59, Konstantin Knizhnik wrote:
>> Fixed patch version of the path is attached.
>
> Much of the patch and the discussion has been around the raw parsing,
> and guessing which literals are actually parameters that have been
> inlined into the SQL text. Do we really need to do that, though? The
> documentation mentions pgbouncer and other connection poolers, where
> you don't have session state, as a use case for this. But such
> connection poolers could and should still be using the extended query
> protocol, with Parse+Bind messages and query parameters, even if they
> don't use named prepared statements. I'd want to encourage
> applications and middleware to use out-of-band query parameters, to
> avoid SQL injection attacks, regardless of whether they use prepared
> statements or cache query plans. So how about dropping all the raw
> parse tree stuff, and doing the automatic caching of plans based on
> the SQL string, somewhere in the exec_parse_message? Check the
> autoprepare cache in exec_parse_message(), if it was an "unnamed"
> prepared statement, i.e. if the prepared statement name is an empty
> string.
>
> (I'm actually not sure if exec_parse/bind_message is the right place
> for this, but I saw that your current patch did it in
> exec_simple_query, and exec_parse/bind_message are the equivalent of
> that for the extended query protocol).

It will significantly simplify this patch and eliminate all complexity
and troubles  caused by replacing string literals with parameters
if assumption that all client applications are using extended query
protocol is true.
But I am not sure that we can expect it. At least I myself see many
applications which are constructing queries by embedding literal values.
May be it is not so good and safe (SQL injection attack), but many
applications are doing it. And the idea was to improve execution speed
of existed application without changing them.

Also please notice that extended protocol requires passing more message
which has negative effect on performance.
At my  notebook I get about 21k TPS on "pgbench -S" and 18k TPS on
"pgbench -M extended -S".
And it is with unix socket connection! I think that in case of remote
connections difference will be even larger.

So may be committing simple version of this patch which do not need to
solve any challenged problems is good idea.
But I afraid that it will significantly reduce positive effect of this
patch.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-07-31 20:03:01 Re: block-level incremental backup
Previous Message Ashwin Agrawal 2019-07-31 19:37:58 Re: Remove HeapTuple and Buffer dependency for predicate locking functions