Re: Changing SQL Inlining Behaviour (or...?)

From: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Adam Brightwell <adam(dot)brightwell(at)crunchydata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Peter Geoghegan <peter(dot)geoghegan(at)crunchydata(dot)com>, Joe Conway <joe(at)crunchydata(dot)com>
Subject: Re: Changing SQL Inlining Behaviour (or...?)
Date: 2019-01-19 20:31:05
Message-ID: 52427AA5-7A1B-4F22-8E0E-CCBC3AABDE82@cleverelephant.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jan 19, 2019, at 12:25 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Adam Brightwell <adam(dot)brightwell(at)crunchydata(dot)com> writes:
>> I've been working with Paul to create and test a patch (attached) that
>> addresses Solution #2. This patch essentially modifies the inlining
>> logic to compare the cost of the function with the total cost of the
>> parameters. The goal as stated above, is that for these kinds of
>> functions, they would be assigned relatively high cost to trigger the
>> inlining case.
>
> I played around with this a bit. I think we could make the rule simpler
> and more useful by expressing it as "if the total incremental cost from
> multiple evaluation of parameters exceeds the inlinable function's
> nominal cost, then don't inline". The existing rule is "no individual
> param expression can cost more than 10*cpu_operator_cost, if the
> function uses it more than once --- never mind how many times exactly".
> With the default cost (100) for a SQL language function, this'd become
> "the total incremental cost can't be more than 100*cpu_operator_cost".
> So that's more or less in the same ballpark, though it will make some
> different decisions in cases near the boundary, and probably will choose
> to inline in more cases than before. But it's simpler to compute and
> more defensible as a criterion, IMO.

Is there a rule of thumb we can use in costing our wrappers to ensure that they always inline?

P

> inlining-solution-3a.patch, attached, does it like that, keeping the
> existing rule that we flat out won't inline if there are any subplans
> in multiply-evaluated parameters. That seems kind of ugly, though,
> so I also prepared inlining-solution-3b.patch which eliminates the
> cruft around subplans in this logic by moving the cruft into
> cost_qual_eval instead: it has to make an arbitrary assumption about
> the cost of a SubLink.
>
> I'm not really sure which of these I like better. The arbitrary cost
> assumption in cost_qual_eval is not great, and it'd be really bad if
> anyone ever tried to use costs for pre-SS_process_sublinks expressions
> for something more important than the case at hand. But on the other
> hand it seems nice to get rid of the arbitrary inlining choice.
>
> Thoughts?
>
> The other thing we need to consider is whether we need any documentation
> adjustments. I believe that right now, the rules for inlining SQL
> functions are not documented anywhere but the code, and maybe it's not
> this patch's job to change that.
>
> regards, tom lane
>
> diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
> index f0ef102..c99304d 100644
> *** a/src/backend/optimizer/util/clauses.c
> --- b/src/backend/optimizer/util/clauses.c
> *************** evaluate_function(Oid funcid, Oid result
> *** 4645,4651 ****
> * unreasonably expensive to evaluate. The expensiveness check not only
> * prevents us from doing multiple evaluations of an expensive parameter
> * at runtime, but is a safety value to limit growth of an expression due
> ! * to repeated inlining.
> *
> * We must also beware of changing the volatility or strictness status of
> * functions by inlining them.
> --- 4645,4653 ----
> * unreasonably expensive to evaluate. The expensiveness check not only
> * prevents us from doing multiple evaluations of an expensive parameter
> * at runtime, but is a safety value to limit growth of an expression due
> ! * to repeated inlining. (Currently, the expensiveness check is expressed
> ! * as a limit on the total additional cost of extra parameter evaluations,
> ! * rather than of any one parameter.)
> *
> * We must also beware of changing the volatility or strictness status of
> * functions by inlining them.
> *************** inline_function(Oid funcid, Oid result_t
> *** 4682,4687 ****
> --- 4684,4690 ----
> Query *querytree;
> Node *newexpr;
> int *usecounts;
> + Cost delta_param_cost;
> ListCell *arg;
> int i;
>
> *************** inline_function(Oid funcid, Oid result_t
> *** 4876,4882 ****
> newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
> args, usecounts);
>
> ! /* Now check for parameter usage */
> i = 0;
> foreach(arg, args)
> {
> --- 4879,4889 ----
> newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
> args, usecounts);
>
> ! /*
> ! * Check for parameter usage, and estimate the delta in evaluation cost
> ! * resulting from parameters possibly being evaluated more than once.
> ! */
> ! delta_param_cost = 0.0;
> i = 0;
> foreach(arg, args)
> {
> *************** inline_function(Oid funcid, Oid result_t
> *** 4887,4922 ****
> /* Param not used at all: uncool if func is strict */
> if (funcform->proisstrict)
> goto fail;
> }
> else if (usecounts[i] != 1)
> {
> ! /* Param used multiple times: uncool if expensive or volatile */
> QualCost eval_cost;
>
> /*
> ! * We define "expensive" as "contains any subplan or more than 10
> ! * operators". Note that the subplan search has to be done
> ! * explicitly, since cost_qual_eval() will barf on unplanned
> ! * subselects.
> */
> if (contain_subplans(param))
> goto fail;
> - cost_qual_eval(&eval_cost, list_make1(param), NULL);
> - if (eval_cost.startup + eval_cost.per_tuple >
> - 10 * cpu_operator_cost)
> - goto fail;
>
> /*
> ! * Check volatility last since this is more expensive than the
> ! * above tests
> */
> ! if (contain_volatile_functions(param))
> ! goto fail;
> }
> i++;
> }
>
> /*
> * Whew --- we can make the substitution. Copy the modified expression
> * out of the temporary memory context, and clean up.
> */
> --- 4894,4949 ----
> /* Param not used at all: uncool if func is strict */
> if (funcform->proisstrict)
> goto fail;
> +
> + /*
> + * In principle, we should credit the eval cost of the param expr
> + * as a savings from inlining. However, it's hard to deal with
> + * subplans properly, so we don't bother. This case is unlikely
> + * anyway, so we just try to be semantically correct.
> + */
> }
> else if (usecounts[i] != 1)
> {
> ! /* Param used multiple times: uncool if volatile */
> QualCost eval_cost;
>
> + if (contain_volatile_functions(param))
> + goto fail;
> +
> /*
> ! * If the param contains any SubLinks, cost_qual_eval() will barf
> ! * on it. Arbitrarily assume that the true cost of a subplan is
> ! * high enough that we should not inline.
> */
> if (contain_subplans(param))
> goto fail;
>
> /*
> ! * Otherwise, estimate the evaluation cost of the param, and
> ! * charge for the N-1 extra evaluations. Note that we consider
> ! * only per-tuple cost not startup cost. Ideally we'd account for
> ! * that somehow, but it's hard since we have as yet no idea how
> ! * many times the query will execute this expression.
> */
> ! cost_qual_eval_node(&eval_cost, param, context->root);
> ! delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1);
> }
> i++;
> }
>
> /*
> + * If the incremental cost of multiple parameter evaluation exceeds the
> + * nominal cost of the function we're inlining, don't inline. With the
> + * default cost for a SQL-language function, this rule causes us to not
> + * inline if multiple evaluation would add more than 100 times
> + * cpu_operator_cost. However, function authors can adjust the claimed
> + * cost of inlinable functions to encourage or discourage inlining in the
> + * presence of multiple parameter usage.
> + */
> + if (delta_param_cost > funcform->procost * cpu_operator_cost)
> + goto fail;
> +
> + /*
> * Whew --- we can make the substitution. Copy the modified expression
> * out of the temporary memory context, and clean up.
> */
> diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
> index 99c5ad9..eb311fa 100644
> *** a/src/backend/optimizer/path/costsize.c
> --- b/src/backend/optimizer/path/costsize.c
> *************** cost_qual_eval_walker(Node *node, cost_q
> *** 3929,3936 ****
> }
> else if (IsA(node, SubLink))
> {
> ! /* This routine should not be applied to un-planned expressions */
> ! elog(ERROR, "cannot handle unplanned sub-select");
> }
> else if (IsA(node, SubPlan))
> {
> --- 3929,3943 ----
> }
> else if (IsA(node, SubLink))
> {
> ! /*
> ! * In normal usage we won't see un-planned SubLinks here, because they
> ! * will previously have been converted to SubPlans or InitPlans.
> ! * However, we can reach here due to inline_function() wishing to
> ! * estimate costs for function parameter expressions. Arbitrarily
> ! * return a fairly large cost to discourage inlining when that would
> ! * cause multiple eval of a subplan.
> ! */
> ! context->total.per_tuple += 100000 * cpu_operator_cost;
> }
> else if (IsA(node, SubPlan))
> {
> diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
> index f0ef102..84fdcc6 100644
> *** a/src/backend/optimizer/util/clauses.c
> --- b/src/backend/optimizer/util/clauses.c
> *************** evaluate_function(Oid funcid, Oid result
> *** 4645,4651 ****
> * unreasonably expensive to evaluate. The expensiveness check not only
> * prevents us from doing multiple evaluations of an expensive parameter
> * at runtime, but is a safety value to limit growth of an expression due
> ! * to repeated inlining.
> *
> * We must also beware of changing the volatility or strictness status of
> * functions by inlining them.
> --- 4645,4653 ----
> * unreasonably expensive to evaluate. The expensiveness check not only
> * prevents us from doing multiple evaluations of an expensive parameter
> * at runtime, but is a safety value to limit growth of an expression due
> ! * to repeated inlining. (Currently, the expensiveness check is expressed
> ! * as a limit on the total additional cost of extra parameter evaluations,
> ! * rather than of any one parameter.)
> *
> * We must also beware of changing the volatility or strictness status of
> * functions by inlining them.
> *************** inline_function(Oid funcid, Oid result_t
> *** 4682,4687 ****
> --- 4684,4690 ----
> Query *querytree;
> Node *newexpr;
> int *usecounts;
> + Cost delta_param_cost;
> ListCell *arg;
> int i;
>
> *************** inline_function(Oid funcid, Oid result_t
> *** 4876,4882 ****
> newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
> args, usecounts);
>
> ! /* Now check for parameter usage */
> i = 0;
> foreach(arg, args)
> {
> --- 4879,4889 ----
> newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
> args, usecounts);
>
> ! /*
> ! * Check for parameter usage, and estimate the delta in evaluation cost
> ! * resulting from parameters possibly being evaluated more than once.
> ! */
> ! delta_param_cost = 0.0;
> i = 0;
> foreach(arg, args)
> {
> *************** inline_function(Oid funcid, Oid result_t
> *** 4888,4922 ****
> if (funcform->proisstrict)
> goto fail;
> }
> ! else if (usecounts[i] != 1)
> {
> ! /* Param used multiple times: uncool if expensive or volatile */
> ! QualCost eval_cost;
> !
> ! /*
> ! * We define "expensive" as "contains any subplan or more than 10
> ! * operators". Note that the subplan search has to be done
> ! * explicitly, since cost_qual_eval() will barf on unplanned
> ! * subselects.
> ! */
> ! if (contain_subplans(param))
> ! goto fail;
> ! cost_qual_eval(&eval_cost, list_make1(param), NULL);
> ! if (eval_cost.startup + eval_cost.per_tuple >
> ! 10 * cpu_operator_cost)
> goto fail;
>
> /*
> ! * Check volatility last since this is more expensive than the
> ! * above tests
> */
> ! if (contain_volatile_functions(param))
> ! goto fail;
> }
> i++;
> }
>
> /*
> * Whew --- we can make the substitution. Copy the modified expression
> * out of the temporary memory context, and clean up.
> */
> --- 4895,4939 ----
> if (funcform->proisstrict)
> goto fail;
> }
> ! else if (usecounts[i] > 1)
> {
> ! /* Param used multiple times: uncool if volatile */
> ! if (contain_volatile_functions(param))
> goto fail;
> + }
>
> + if (usecounts[i] != 1)
> + {
> /*
> ! * Estimate the evaluation cost of the param, and charge for the
> ! * N-1 extra evaluations --- or, if it's not used at all, credit
> ! * the savings. Note that we consider only per-tuple cost, not
> ! * startup cost. Ideally we'd account for that somehow, but it's
> ! * hard since we have as yet no idea how many times the query will
> ! * execute this expression.
> */
> ! QualCost eval_cost;
> !
> ! cost_qual_eval_node(&eval_cost, param, context->root);
> ! delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1);
> }
> +
> i++;
> }
>
> /*
> + * If the incremental cost of multiple parameter evaluation exceeds the
> + * nominal cost of the function we're inlining, don't inline. With the
> + * default cost for a SQL-language function, this rule causes us to not
> + * inline if multiple evaluation would add more than 100 times
> + * cpu_operator_cost. However, function authors can adjust the claimed
> + * cost of inlinable functions to encourage or discourage inlining in the
> + * presence of multiple parameter usage.
> + */
> + if (delta_param_cost > funcform->procost * cpu_operator_cost)
> + goto fail;
> +
> + /*
> * Whew --- we can make the substitution. Copy the modified expression
> * out of the temporary memory context, and clean up.
> */

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-01-19 20:44:09 Re: Changing SQL Inlining Behaviour (or...?)
Previous Message Tom Lane 2019-01-19 20:25:41 Re: Changing SQL Inlining Behaviour (or...?)