| From: | Paul Ramsey <pramsey(at)cleverelephant(dot)ca> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Adam Brightwell <adam(dot)brightwell(at)crunchydata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Peter Geoghegan <peter(dot)geoghegan(at)crunchydata(dot)com>, Joe Conway <joe(at)crunchydata(dot)com> | 
| Subject: | Re: Changing SQL Inlining Behaviour (or...?) | 
| Date: | 2019-01-19 20:31:05 | 
| Message-ID: | 52427AA5-7A1B-4F22-8E0E-CCBC3AABDE82@cleverelephant.ca | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
> On Jan 19, 2019, at 12:25 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 
> Adam Brightwell <adam(dot)brightwell(at)crunchydata(dot)com> writes:
>> I've been working with Paul to create and test a patch (attached) that
>> addresses Solution #2. This patch essentially modifies the inlining
>> logic to compare the cost of the function with the total cost of the
>> parameters. The goal as stated above, is that for these kinds of
>> functions, they would be assigned relatively high cost to trigger the
>> inlining case.
> 
> I played around with this a bit.  I think we could make the rule simpler
> and more useful by expressing it as "if the total incremental cost from
> multiple evaluation of parameters exceeds the inlinable function's
> nominal cost, then don't inline".  The existing rule is "no individual
> param expression can cost more than 10*cpu_operator_cost, if the
> function uses it more than once --- never mind how many times exactly".
> With the default cost (100) for a SQL language function, this'd become
> "the total incremental cost can't be more than 100*cpu_operator_cost".
> So that's more or less in the same ballpark, though it will make some
> different decisions in cases near the boundary, and probably will choose
> to inline in more cases than before.  But it's simpler to compute and
> more defensible as a criterion, IMO.
Is there a rule of thumb we can use in costing our wrappers to ensure that they always inline?
P
> inlining-solution-3a.patch, attached, does it like that, keeping the
> existing rule that we flat out won't inline if there are any subplans
> in multiply-evaluated parameters.  That seems kind of ugly, though,
> so I also prepared inlining-solution-3b.patch which eliminates the
> cruft around subplans in this logic by moving the cruft into
> cost_qual_eval instead: it has to make an arbitrary assumption about
> the cost of a SubLink.
> 
> I'm not really sure which of these I like better.  The arbitrary cost
> assumption in cost_qual_eval is not great, and it'd be really bad if
> anyone ever tried to use costs for pre-SS_process_sublinks expressions
> for something more important than the case at hand.  But on the other
> hand it seems nice to get rid of the arbitrary inlining choice.
> 
> Thoughts?
> 
> The other thing we need to consider is whether we need any documentation
> adjustments.  I believe that right now, the rules for inlining SQL
> functions are not documented anywhere but the code, and maybe it's not
> this patch's job to change that.
> 
> 			regards, tom lane
> 
> diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
> index f0ef102..c99304d 100644
> *** a/src/backend/optimizer/util/clauses.c
> --- b/src/backend/optimizer/util/clauses.c
> *************** evaluate_function(Oid funcid, Oid result
> *** 4645,4651 ****
>   * unreasonably expensive to evaluate.  The expensiveness check not only
>   * prevents us from doing multiple evaluations of an expensive parameter
>   * at runtime, but is a safety value to limit growth of an expression due
> !  * to repeated inlining.
>   *
>   * We must also beware of changing the volatility or strictness status of
>   * functions by inlining them.
> --- 4645,4653 ----
>   * unreasonably expensive to evaluate.  The expensiveness check not only
>   * prevents us from doing multiple evaluations of an expensive parameter
>   * at runtime, but is a safety value to limit growth of an expression due
> !  * to repeated inlining.  (Currently, the expensiveness check is expressed
> !  * as a limit on the total additional cost of extra parameter evaluations,
> !  * rather than of any one parameter.)
>   *
>   * We must also beware of changing the volatility or strictness status of
>   * functions by inlining them.
> *************** inline_function(Oid funcid, Oid result_t
> *** 4682,4687 ****
> --- 4684,4690 ----
>  	Query	   *querytree;
>  	Node	   *newexpr;
>  	int		   *usecounts;
> + 	Cost		delta_param_cost;
>  	ListCell   *arg;
>  	int			i;
> 
> *************** inline_function(Oid funcid, Oid result_t
> *** 4876,4882 ****
>  	newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
>  										   args, usecounts);
> 
> ! 	/* Now check for parameter usage */
>  	i = 0;
>  	foreach(arg, args)
>  	{
> --- 4879,4889 ----
>  	newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
>  										   args, usecounts);
> 
> ! 	/*
> ! 	 * Check for parameter usage, and estimate the delta in evaluation cost
> ! 	 * resulting from parameters possibly being evaluated more than once.
> ! 	 */
> ! 	delta_param_cost = 0.0;
>  	i = 0;
>  	foreach(arg, args)
>  	{
> *************** inline_function(Oid funcid, Oid result_t
> *** 4887,4922 ****
>  			/* Param not used at all: uncool if func is strict */
>  			if (funcform->proisstrict)
>  				goto fail;
>  		}
>  		else if (usecounts[i] != 1)
>  		{
> ! 			/* Param used multiple times: uncool if expensive or volatile */
>  			QualCost	eval_cost;
> 
>  			/*
> ! 			 * We define "expensive" as "contains any subplan or more than 10
> ! 			 * operators".  Note that the subplan search has to be done
> ! 			 * explicitly, since cost_qual_eval() will barf on unplanned
> ! 			 * subselects.
>  			 */
>  			if (contain_subplans(param))
>  				goto fail;
> - 			cost_qual_eval(&eval_cost, list_make1(param), NULL);
> - 			if (eval_cost.startup + eval_cost.per_tuple >
> - 				10 * cpu_operator_cost)
> - 				goto fail;
> 
>  			/*
> ! 			 * Check volatility last since this is more expensive than the
> ! 			 * above tests
>  			 */
> ! 			if (contain_volatile_functions(param))
> ! 				goto fail;
>  		}
>  		i++;
>  	}
> 
>  	/*
>  	 * Whew --- we can make the substitution.  Copy the modified expression
>  	 * out of the temporary memory context, and clean up.
>  	 */
> --- 4894,4949 ----
>  			/* Param not used at all: uncool if func is strict */
>  			if (funcform->proisstrict)
>  				goto fail;
> + 
> + 			/*
> + 			 * In principle, we should credit the eval cost of the param expr
> + 			 * as a savings from inlining.  However, it's hard to deal with
> + 			 * subplans properly, so we don't bother.  This case is unlikely
> + 			 * anyway, so we just try to be semantically correct.
> + 			 */
>  		}
>  		else if (usecounts[i] != 1)
>  		{
> ! 			/* Param used multiple times: uncool if volatile */
>  			QualCost	eval_cost;
> 
> + 			if (contain_volatile_functions(param))
> + 				goto fail;
> + 
>  			/*
> ! 			 * If the param contains any SubLinks, cost_qual_eval() will barf
> ! 			 * on it.  Arbitrarily assume that the true cost of a subplan is
> ! 			 * high enough that we should not inline.
>  			 */
>  			if (contain_subplans(param))
>  				goto fail;
> 
>  			/*
> ! 			 * Otherwise, estimate the evaluation cost of the param, and
> ! 			 * charge for the N-1 extra evaluations.  Note that we consider
> ! 			 * only per-tuple cost not startup cost.  Ideally we'd account for
> ! 			 * that somehow, but it's hard since we have as yet no idea how
> ! 			 * many times the query will execute this expression.
>  			 */
> ! 			cost_qual_eval_node(&eval_cost, param, context->root);
> ! 			delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1);
>  		}
>  		i++;
>  	}
> 
>  	/*
> + 	 * If the incremental cost of multiple parameter evaluation exceeds the
> + 	 * nominal cost of the function we're inlining, don't inline.  With the
> + 	 * default cost for a SQL-language function, this rule causes us to not
> + 	 * inline if multiple evaluation would add more than 100 times
> + 	 * cpu_operator_cost.  However, function authors can adjust the claimed
> + 	 * cost of inlinable functions to encourage or discourage inlining in the
> + 	 * presence of multiple parameter usage.
> + 	 */
> + 	if (delta_param_cost > funcform->procost * cpu_operator_cost)
> + 		goto fail;
> + 
> + 	/*
>  	 * Whew --- we can make the substitution.  Copy the modified expression
>  	 * out of the temporary memory context, and clean up.
>  	 */
> diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
> index 99c5ad9..eb311fa 100644
> *** a/src/backend/optimizer/path/costsize.c
> --- b/src/backend/optimizer/path/costsize.c
> *************** cost_qual_eval_walker(Node *node, cost_q
> *** 3929,3936 ****
>  	}
>  	else if (IsA(node, SubLink))
>  	{
> ! 		/* This routine should not be applied to un-planned expressions */
> ! 		elog(ERROR, "cannot handle unplanned sub-select");
>  	}
>  	else if (IsA(node, SubPlan))
>  	{
> --- 3929,3943 ----
>  	}
>  	else if (IsA(node, SubLink))
>  	{
> ! 		/*
> ! 		 * In normal usage we won't see un-planned SubLinks here, because they
> ! 		 * will previously have been converted to SubPlans or InitPlans.
> ! 		 * However, we can reach here due to inline_function() wishing to
> ! 		 * estimate costs for function parameter expressions.  Arbitrarily
> ! 		 * return a fairly large cost to discourage inlining when that would
> ! 		 * cause multiple eval of a subplan.
> ! 		 */
> ! 		context->total.per_tuple += 100000 * cpu_operator_cost;
>  	}
>  	else if (IsA(node, SubPlan))
>  	{
> diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
> index f0ef102..84fdcc6 100644
> *** a/src/backend/optimizer/util/clauses.c
> --- b/src/backend/optimizer/util/clauses.c
> *************** evaluate_function(Oid funcid, Oid result
> *** 4645,4651 ****
>   * unreasonably expensive to evaluate.  The expensiveness check not only
>   * prevents us from doing multiple evaluations of an expensive parameter
>   * at runtime, but is a safety value to limit growth of an expression due
> !  * to repeated inlining.
>   *
>   * We must also beware of changing the volatility or strictness status of
>   * functions by inlining them.
> --- 4645,4653 ----
>   * unreasonably expensive to evaluate.  The expensiveness check not only
>   * prevents us from doing multiple evaluations of an expensive parameter
>   * at runtime, but is a safety value to limit growth of an expression due
> !  * to repeated inlining.  (Currently, the expensiveness check is expressed
> !  * as a limit on the total additional cost of extra parameter evaluations,
> !  * rather than of any one parameter.)
>   *
>   * We must also beware of changing the volatility or strictness status of
>   * functions by inlining them.
> *************** inline_function(Oid funcid, Oid result_t
> *** 4682,4687 ****
> --- 4684,4690 ----
>  	Query	   *querytree;
>  	Node	   *newexpr;
>  	int		   *usecounts;
> + 	Cost		delta_param_cost;
>  	ListCell   *arg;
>  	int			i;
> 
> *************** inline_function(Oid funcid, Oid result_t
> *** 4876,4882 ****
>  	newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
>  										   args, usecounts);
> 
> ! 	/* Now check for parameter usage */
>  	i = 0;
>  	foreach(arg, args)
>  	{
> --- 4879,4889 ----
>  	newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
>  										   args, usecounts);
> 
> ! 	/*
> ! 	 * Check for parameter usage, and estimate the delta in evaluation cost
> ! 	 * resulting from parameters possibly being evaluated more than once.
> ! 	 */
> ! 	delta_param_cost = 0.0;
>  	i = 0;
>  	foreach(arg, args)
>  	{
> *************** inline_function(Oid funcid, Oid result_t
> *** 4888,4922 ****
>  			if (funcform->proisstrict)
>  				goto fail;
>  		}
> ! 		else if (usecounts[i] != 1)
>  		{
> ! 			/* Param used multiple times: uncool if expensive or volatile */
> ! 			QualCost	eval_cost;
> ! 
> ! 			/*
> ! 			 * We define "expensive" as "contains any subplan or more than 10
> ! 			 * operators".  Note that the subplan search has to be done
> ! 			 * explicitly, since cost_qual_eval() will barf on unplanned
> ! 			 * subselects.
> ! 			 */
> ! 			if (contain_subplans(param))
> ! 				goto fail;
> ! 			cost_qual_eval(&eval_cost, list_make1(param), NULL);
> ! 			if (eval_cost.startup + eval_cost.per_tuple >
> ! 				10 * cpu_operator_cost)
>  				goto fail;
> 
>  			/*
> ! 			 * Check volatility last since this is more expensive than the
> ! 			 * above tests
>  			 */
> ! 			if (contain_volatile_functions(param))
> ! 				goto fail;
>  		}
>  		i++;
>  	}
> 
>  	/*
>  	 * Whew --- we can make the substitution.  Copy the modified expression
>  	 * out of the temporary memory context, and clean up.
>  	 */
> --- 4895,4939 ----
>  			if (funcform->proisstrict)
>  				goto fail;
>  		}
> ! 		else if (usecounts[i] > 1)
>  		{
> ! 			/* Param used multiple times: uncool if volatile */
> ! 			if (contain_volatile_functions(param))
>  				goto fail;
> + 		}
> 
> + 		if (usecounts[i] != 1)
> + 		{
>  			/*
> ! 			 * Estimate the evaluation cost of the param, and charge for the
> ! 			 * N-1 extra evaluations --- or, if it's not used at all, credit
> ! 			 * the savings.  Note that we consider only per-tuple cost, not
> ! 			 * startup cost.  Ideally we'd account for that somehow, but it's
> ! 			 * hard since we have as yet no idea how many times the query will
> ! 			 * execute this expression.
>  			 */
> ! 			QualCost	eval_cost;
> ! 
> ! 			cost_qual_eval_node(&eval_cost, param, context->root);
> ! 			delta_param_cost += eval_cost.per_tuple * (usecounts[i] - 1);
>  		}
> + 
>  		i++;
>  	}
> 
>  	/*
> + 	 * If the incremental cost of multiple parameter evaluation exceeds the
> + 	 * nominal cost of the function we're inlining, don't inline.  With the
> + 	 * default cost for a SQL-language function, this rule causes us to not
> + 	 * inline if multiple evaluation would add more than 100 times
> + 	 * cpu_operator_cost.  However, function authors can adjust the claimed
> + 	 * cost of inlinable functions to encourage or discourage inlining in the
> + 	 * presence of multiple parameter usage.
> + 	 */
> + 	if (delta_param_cost > funcform->procost * cpu_operator_cost)
> + 		goto fail;
> + 
> + 	/*
>  	 * Whew --- we can make the substitution.  Copy the modified expression
>  	 * out of the temporary memory context, and clean up.
>  	 */
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2019-01-19 20:44:09 | Re: Changing SQL Inlining Behaviour (or...?) | 
| Previous Message | Tom Lane | 2019-01-19 20:25:41 | Re: Changing SQL Inlining Behaviour (or...?) |