Re: JIT compilation per plan node

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: JIT compilation per plan node
Date: 2024-05-14 04:10:52
Message-ID: CAApHDvqLcfpYi1qGpOk20SOJutF23MQ3EG+aWeEJ8XuH+8sQPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 15 Mar 2024 at 10:13, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> To clarify, I think the patch is a step in the right direction, and a
> meaningful improvement. It may not be the perfect solution we imagine
> (but who knows how far we are from that), but AFAIK moving these
> decisions to the node level is something the ideal solution would need
> to do too.

I got thinking about this patch again while working on [1]. I want to
write this down as I don't quite have time to get fully back into this
right now...

Currently, during execution, ExecCreateExprSetupSteps() traverses the
Node tree of the Expr to figure out the max varattno of for each slot.
That's done so all of the tuple deforming happens at once rather than
incrementally. Figuring out the max varattno is a price we have to pay
for every execution of the query. I think we'd be better off doing
that in the planner.

To do this, I thought that setrefs.c could do this processing in
fix_join_expr / fix_upper_expr and wrap up the expression in a new
Node type that stores the max varattno for each special var type.

This idea is related to this discussion because another thing that
could be stored in the very same struct is the "num_exec" value. I
feel the number of executions of an ExprState is a better gauge of how
useful JIT will be than the cost of the plan node. Now, looking at
set_join_references(), the execution estimates are not exactly
perfect. For example;

#define NUM_EXEC_QUAL(parentplan) ((parentplan)->plan_rows * 2.0)

that's not a great estimate for a Nested Loop's joinqual, but we could
easily make efforts to improve those and that could likely be done
independently and concurrently with other work to make JIT more
granular.

The problem with doing this is that there's just a huge amount of code
churn in the executor. I am keen to come up with a prototype so I can
get a better understanding of if this is a practical solution. I
don't want to go down that path if it's just me that thinks the number
of times an ExprState is evaluated is a better measure to go on for
JIT vs no JIT than the plan node's total cost.

Does anyone have any thoughts on that idea?

David

[1] https://postgr.es/m/CAApHDvoexAxgQFNQD_GRkr2O_eJUD1-wUGm=m0L+Gc=T=kEa4g@mail.gmail.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-05-14 04:18:45 Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Previous Message Peter Smith 2024-05-14 04:06:34 Re: Slow catchup of 2PC (twophase) transactions on replica in LR