Re: Cached plans and statement generalization

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cached plans and statement generalization
Date: 2017-05-02 09:50:17
Message-ID: 4bd78cd4-1e64-5294-2eb8-3d061b20288f@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01.05.2017 18:52, Robert Haas wrote:
> On Fri, Apr 28, 2017 at 6:01 AM, Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>> wrote:
>
>
> Any comments and suggestions for future improvement of this patch
> are welcome.
>
>
> + PG_TRY();
> + {
> + query = parse_analyze_varparams(parse_tree,
> + query_string,
> + &param_types,
> + &num_params);
> + }
> + PG_CATCH();
> + {
> + /*
> + * In case of analyze errors revert back to original
> query processing
> + * and disable autoprepare for this query to avoid such
> problems in future.
> + */
> + FlushErrorState();
> + if (snapshot_set) {
> + PopActiveSnapshot();
> + }
> + entry->disable_autoprepare = true;
> + undo_query_plan_changes(parse_tree, const_param_list);
> + MemoryContextSwitchTo(old_context);
> + return false;
> + }
> + PG_END_TRY();
>
> This is definitely not a safe way of using TRY/CATCH.
>
> +
> + /* Convert literal value to parameter value */
> + switch (const_param->literal->val.type)
> + {
> + /*
> + * Convert from integer literal
> + */
> + case T_Integer:
> + switch (ptype) {
> + case INT8OID:
> + params->params[paramno].value =
> Int64GetDatum((int64)const_param->literal->val.val.ival);
> + break;
> + case INT4OID:
> + params->params[paramno].value =
> Int32GetDatum((int32)const_param->literal->val.val.ival);
> + break;
> + case INT2OID:
> + if (const_param->literal->val.val.ival < SHRT_MIN
> + || const_param->literal->val.val.ival > SHRT_MAX)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
> + errmsg("smallint out of range")));
> + }
> + params->params[paramno].value =
> Int16GetDatum((int16)const_param->literal->val.val.ival);
> + break;
> + case FLOAT4OID:
> + params->params[paramno].value =
> Float4GetDatum((float)const_param->literal->val.val.ival);
> + break;
> + case FLOAT8OID:
> + params->params[paramno].value =
> Float8GetDatum((double)const_param->literal->val.val.ival);
> + break;
> + case INT4RANGEOID:
> + sprintf(buf, "[%ld,%ld]",
> const_param->literal->val.val.ival, const_param->literal->val.val.ival);
> + getTypeInputInfo(ptype, &typinput, &typioparam);
> + params->params[paramno].value =
> OidInputFunctionCall(typinput, buf, typioparam, -1);
> + break;
> + default:
> + pg_lltoa(const_param->literal->val.val.ival, buf);
> + getTypeInputInfo(ptype, &typinput, &typioparam);
> + params->params[paramno].value =
> OidInputFunctionCall(typinput, buf, typioparam, -1);
> + }
> + break;
> + case T_Null:
> + params->params[paramno].isnull = true;
> + break;
> + default:
> + /*
> + * Convert from string literal
> + */
> + getTypeInputInfo(ptype, &typinput, &typioparam);
> + params->params[paramno].value =
> OidInputFunctionCall(typinput, const_param->literal->val.val.str,
> typioparam, -1);
> + }
>
> I don't see something with a bunch of hard-coded rules for particular
> type OIDs having any chance of being acceptable.
>

Well, what I need is to convert literal value represented in Value
struct to parameter datum value.
Struct "value" contains union with integer literal and text.
So this peace of code is just provides efficient handling of most common
cases (integer parameters) and uses type's input function in other cases.

> This patch seems to duplicate a large amount of existing code. That
> would be a good thing to avoid.

Yes, I have to copy a lot of code from exec_parse_message +
exec_bind_message + exec_execute_message functions.
Definitely copying of code is bad flaw. It will be much better and
easier just to call three original functions instead of mixing gathering
their code into the new function.
But I failed to do it because
1. Autoprepare should be integrated into exec_simple_query. Before
executing query in normal way, I need to perform cache lookup for
previously prepared plan for this generalized query.
And generalization of query requires building of query tree (query
parsing). In other words, parsing should be done before I can call
exec_parse_message.
2. exec_bind_message works with parameters passed by client though
libpwq protocol, while autoprepare deals with values of parameters
extracted from literals.
3. I do not want to generate dummy name for autoprepared query to handle
it as normal prepared statement.
And I can not use unnamed statements because I want lifetime of
autoprepared statements will be larger than one statement.
4. I have to use slightly different memory context policy than named or
unnamed prepared statements.

Also this three exec_* functions contain prolog/epilog code which is
needed because them are serving separate libpq requests.
But in case of autoprepared statements them need to be executed in the
context of single libpq message, so most of this code is redundant.

> It could use a visit from the style police and a spell-checker, too.

I will definitely fix style and misspelling - I have not submitted yet
this patch to commit fest and there is long enough time to next commitfest.
My primary intention of publishing this patch is receive feedback on the
proposed approach.
I already got two very useful advices: limit number of cached statements
and pay more attention to safety.
This is why I have reimplemented my original approach with substituting
string literals with parameters without building parse tree.

Right now I am mostly thinking about two problems:
1. Finding out best criteria of detecting literals which need to be
replaced with parameters and which not. It is clear that replacing
"limit 10" with "limit $10"
will have not so much sense and can cause worse execution plan. So right
now I just ignore sort, group by and limit parts. But may be it is
possible to find some more flexible approach.
2. Which type to chose for parameters. I can try to infer type from
context (current solution), or try to use type of literal.
The problem with first approach is that query compiler is not always
able to do it and even if type can be determined, it may be too generic
(for example numeric instead of real
or range instead of integer). The problem with second approach is
opposite: type inferred from literal type can be too restrictive - quite
often integer literals are used to specify values of floating point
constant. The best solution is first try to determine parameter type
from context and then refine it based on literal type. But it will
require repeat of query analysis.
Not sure if it is possible.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Thanks for your feedback.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2017-05-02 09:58:36 Typos in comments in partition.c
Previous Message Beena Emerson 2017-05-02 09:38:32 Re: multi-column range partition constraint