Re: JIT compilation per plan node

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: JIT compilation per plan node
Date: 2024-02-20 10:31:12
Message-ID: dbbb5688-4001-4244-94a1-bf803af763bd@enterprisedb.com

On 2/20/24 06:14, David Rowley wrote:
> On Tue, 20 Feb 2024 at 05:26, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> I doubt CreatePlanContext is a great way to achieve this. For one, it
>> breaks the long-standing custom that PlannerInfo is the first parameter,
>> usually followed by RelOptInfo, etc. CreatePlanContext is added to some
>> functions (but not all), which makes it ... unpredictable.
>
> I suggested this to Melih as a way of doing it, based on what Andy
> wrote in [1]. I agree with Andy that it's not great to add est_calls to
> every function in createplan.c. I feel that CreatePlanContext is a way
> to take the hit of adding a parameter once and hopefully never having
> to do it again to this degree. I wondered if PlannerInfo could be a
> field in CreatePlanContext.
>

You mean we'd be adding more parameters to CreatePlanContext in the
future? Not sure that's a great idea - there are reasons why we have
separate arguments in function signatures rather than a single struct.

I think contexts are nice for grouping attributes that serve a particular
purpose, not as a replacement for arguments just to make function
signatures simpler.
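
For the sake of discussion, here's roughly what I imagine the struct
looks like - the field names are my guess, I haven't checked the patch:

typedef struct CreatePlanContext
{
    PlannerInfo *root;       /* global planner state */
    double       est_calls;  /* estimated # of executions of this node */
} CreatePlanContext;

If it really is just PlannerInfo plus est_calls, I'm not sure wrapping
them in a struct buys us much over passing est_calls explicitly.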

>> FWIW it's not clear to me if/how this solves the problem with early
>> create_plan() calls for subplans. Or is it still the same?
>>
>> Wouldn't it be simpler to just build the plan as we do now, and then
>> have an expression_tree_walker that walks the complete plan top-down,
>> inspects the nodes, enables JIT where appropriate and so on? That can
>> have arbitrary context, no problem with that.
>
> Why walk the entire plan tree again to do something we could do when
> building it in the first place? Recursively walking trees isn't great
> from a performance point of view. It would be nice to avoid this if we
> can find some other way to handle subplans. I do have a few other
> reasons up my sleeve why subplan creation should be delayed until
> later, so maybe we should fix that to unblock those issues.
>
>> Considering we decide JIT pretty late anyway (long after costing and
>> other stuff that might affect the plan selection), the result should be
>> exactly the same, without the extensive createplan.c disruption ...
>>
>> (usual caveat: I haven't tried, maybe there's something that means this
>> can't work)
>
> It's not like we can look at the top node's cost as a pre-check to
> skip the recursive step for cheap plans, as it's perfectly valid for a
> node closer to the root of the plan tree to have a lower total cost
> than that node's subnodes, e.g. LIMIT 1.
>

I'd argue that's actually a reason to do the precheck, exactly because
of the LIMIT. The fact that some node has high total cost does not
matter if there is LIMIT 1 above it. What matters is which fraction of
the plan we execute, not the total cost.

Imagine you have something like

Limit  (cost=0.00..1.00 rows=1 ...)
  ->  Seq Scan  (cost=0.00..100000000.00 rows=1000000 ...)

I'd argue JIT-ing the seqscan is likely pointless, because on average
we'll execute ~1/1000000 of the scan, and the actual cost will be ~100.
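
A simple pre-check could approximate this the same way we interpolate
between startup and total cost elsewhere in the planner. A minimal
sketch of the idea (the function and its name are mine, not from the
patch):

/*
 * Sketch only: the cost we actually pay for a node when only
 * "rows_needed" of its rows get consumed, assuming the cost
 * accrues linearly between startup_cost and total_cost.
 */
static double
effective_cost(Plan *plan, double rows_needed)
{
    double      fraction = 1.0;

    if (plan->plan_rows > 0 && rows_needed < plan->plan_rows)
        fraction = rows_needed / plan->plan_rows;

    return plan->startup_cost +
        fraction * (plan->total_cost - plan->startup_cost);
}

For the plan above that's 0 + (1/1000000) * 100000000 = ~100, nowhere
near any sane jit_above_cost.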

>>> - The number of estimated calls is especially useful where a node is
>>> expected to be rescanned, such as Gather, Gather Merge, Memoize and
>>> Nested Loop. Knowing the estimated number of calls for a node allows
>>> us to rely on the total cost multiplied by the estimated calls,
>>> instead of only the total cost of the node.
>>
>> Not sure I follow. Why would these nodes (Gather, Gather Merge, ...) be
>> more likely to be rescanned compared to other nodes?
>
> I think Melih is listing nodes that can change the est_calls. Any
> node can be rescanned, but only a subset of nodes can adjust the
> number of times they call their subnode vs how many times they
> themselves are called.
>

OK

>>> - For each node, the planner considers whether the node should be
>>> JITed. If the cost of the node multiplied by the number of estimated
>>> calls exceeds jit_above_cost, the node is JIT compiled. Note that this
>>> changes the meaning of jit_above_cost: it's now a threshold for a
>>> single plan node, not for the whole query. Additionally, this change
>>> applies only to JIT compilation itself; inlining and optimizations
>>> continue to be decided for the whole query, based on its overall cost.
>>
>> It's not clear to me why JIT compilation is decided for each node, while
>> the inlining/optimization is decided for the plan as a whole. I'm not
>> familiar with the JIT stuff, so maybe it's obvious to others ...
>
> This is a limitation of LLVM, IIRC. Inlining and optimisation are
> decisions that have to be made for an entire compilation unit; they
> can't be made at the expression level. That's pretty annoying, as it
> makes it hard to come up with good logic for when to enable
> optimisations and inlining :-(
>
> I think the most useful thing to discuss right now is whether this is
> the best way to fix the JIT costing problem. In [2] I linked to a
> complaint about the JIT costings; see [3]. The OP there wanted to keep
> the plan private, but I did get to see it and described the problem on
> the list.
>
> Also, I don't think the decision to JIT per plan node is perfect, as a
> node's total cost can be high for reasons other than the cost of
> evaluating expressions. The number of times a given expression is
> evaluated can also vary quite a bit depending on where in the plan it
> is evaluated. For example, consider a foreign table scan that does most
> of its filtering remotely but has a non-shippable expr that must be
> evaluated locally. The foreign scan might be very expensive, especially
> if the remote side does lots of filtering with a Seq Scan, yet few rows
> may make it back to the local server to benefit from JITting the
> non-shippable expression.
>
> A counter-example is the join condition of a non-parameterized nested
> loop. Those get evaluated n_outer_rows * n_inner_rows times.
>
> I think the savings JIT gives us on evaluation of expressions are going
> to be more closely tied to the number of times an expression is
> evaluated than to the total cost of the node. However, it's likely more
> complex for optimisations and inlining, as I imagine the size and
> complexity of the comparison function matter too.
>
> It would be good for us all to agree on exactly how we're going to fix
> this problem before Melih gets in too deep fixing the finer details of
> the patch. If anyone isn't convinced there's a problem with the JIT
> costings, I can try to dig up other threads where this has been
> complained about.
>
> Does anyone want to disagree with the general idea of making the
> compilation decision based on the total cost of the node? Or have a
> better idea?
>

I certainly agree that the current JIT costing is quite crude, and we've
all seen cases where the decision turns out to not be great. And I think
the plan to make the decisions at the node level makes sense, so +1 to
that in general.

And I think you're right that looking just at the node's total cost may
not be sufficient - we may need a better cost model that considers how
many times an expression is executed, and so on. But I think we should
try to do this in smaller steps that are meaningful on their own,
otherwise we won't move at all. The two threads linked by Melih are ~4
years old and *nothing* has changed since then, AFAIK.

I think it's reasonable to start by moving the decision to the node
level - it's where the JIT happens, anyway. It may not be perfect, but
it seems like a clear improvement. And if we then choose to improve the
"JIT cost model" to address some of the issues you pointed out, surely
that would need to happen at the node level too ...
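
To make the node-level decision concrete, it might be as simple as this
sketch (a hypothetical function, ignoring the inlining/optimization
decisions discussed above):

/*
 * Sketch only: decide JIT compilation per plan node instead of per
 * query. "est_calls" is the estimated number of times the node will
 * be executed (rescans, loops, ...).
 */
static bool
should_jit_node(Plan *plan, double est_calls)
{
    return jit_enabled &&
        plan->total_cost * est_calls > jit_above_cost;
}

Improving the cost model later - say, weighting by the number of
expression evaluations instead of the node's total cost - would then
only mean changing what feeds into this check.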

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
