Re: bad JIT decision

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Scott Ribe <scott_ribe(at)elevated-dev(dot)com>, PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: bad JIT decision
Date: 2020-08-02 22:21:38
Message-ID: CAApHDvqXc4ZkoCSxxLO_Tf-zNr1yyb_C-d=ncj-Q_pbxVOioFQ@mail.gmail.com
Lists: pgsql-general

On Wed, 29 Jul 2020 at 09:07, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2020-07-28 11:54:53 +1200, David Rowley wrote:
> > Is there some reason that we can't consider jitting on a more granular
> > basis?
>
> There's a substantial "constant" overhead of doing JIT. And that it's
> nontrivial to determine which parts of the query should be JITed in one
> part, and which not.
>
>
> > To me, it seems wrong to have a jit cost per expression and
> > demand that the plan cost > #nexprs * jit_expr_cost before we do jit
> > on anything. It'll make it pretty hard to predict when jit will occur
> > and doing things like adding new partitions could suddenly cause jit
> > to not enable for some query any more.
>
> I think that's the right answer though:

I'm not quite sure why it would be so hard to do more granularly.

Take this case, for example:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create table listp2 partition of listp for values in(2);
insert into listp select 1,x from generate_series(1,1000000) x;

The EXPLAIN looks like:

postgres=# explain select * from listp where b < 100;
                                QUERY PLAN
--------------------------------------------------------------------------
 Append  (cost=0.00..16967.51 rows=853 width=8)
   ->  Seq Scan on listp1 listp_1  (cost=0.00..16925.00 rows=100 width=8)
         Filter: (b < 100)
   ->  Seq Scan on listp2 listp_2  (cost=0.00..38.25 rows=753 width=8)
         Filter: (b < 100)
(5 rows)

Currently, if the total cost of the plan exceeds the jit threshold,
then we JIT all the expressions. If it doesn't, we compile none of
them.

What we could do instead is add a jitFlags field to struct Plan to
indicate the JIT flags on a per plan node level, and set it as we do
now, but based on the total_cost of that plan node rather than on the
cost of the top-level plan in standard_planner(). The code that sets
jitFlags would be moved to the end of create_plan_recurse() instead.

In this case, if we had the threshold set to 10000, then we'd JIT for
listp1 but not for listp2. I don't think this would even require a
signature change in the jit_compile_expr() function as we can get
access to the plan node from state->parent->plan to see which jitFlags
are set, if any.
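
To make the idea concrete, here is a rough standalone sketch in C. It is
not PostgreSQL code: every Sketch*/sketch_* name is invented for
illustration, only a single "perform JIT" bit is modelled, and the
structs are cut-down stand-ins for Plan, PlanState and ExprState. It
contrasts the plan-level decision with the proposed per-node one, and
shows an expression looking up its node's flags via
state->parent->plan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real JIT flag bits. */
#define SKETCH_JIT_NONE     0
#define SKETCH_JIT_PERFORM  (1 << 0)

#define SKETCH_MAX_CHILDREN 8

/* Cut-down stand-in for struct Plan, with the proposed per-node field. */
typedef struct SketchPlan
{
    double      total_cost;
    int         jitFlags;       /* proposed: JIT flags for this node only */
    int         nchildren;
    struct SketchPlan *children[SKETCH_MAX_CHILDREN];
} SketchPlan;

/* Cut-down stand-ins for PlanState and ExprState. */
typedef struct SketchPlanState { SketchPlan *plan; } SketchPlanState;
typedef struct SketchExprState { SketchPlanState *parent; } SketchExprState;

/* Threshold test; a negative threshold disables JIT entirely. */
static int
sketch_flags_for_cost(double total_cost, double jit_above_cost)
{
    if (jit_above_cost >= 0 && total_cost > jit_above_cost)
        return SKETCH_JIT_PERFORM;
    return SKETCH_JIT_NONE;
}

/* Helper: stamp the same flags on every node of the tree. */
static void
sketch_apply_flags(SketchPlan *node, int flags)
{
    assert(node != NULL);
    node->jitFlags = flags;
    for (int i = 0; i < node->nchildren; i++)
        sketch_apply_flags(node->children[i], flags);
}

/* All-or-nothing: decide once from the top-level plan's total cost. */
static void
sketch_set_plan_level(SketchPlan *top, double jit_above_cost)
{
    sketch_apply_flags(top,
                       sketch_flags_for_cost(top->total_cost, jit_above_cost));
}

/* Proposed: decide separately for each node from its own total cost. */
static void
sketch_set_node_level(SketchPlan *node, double jit_above_cost)
{
    assert(node != NULL);
    for (int i = 0; i < node->nchildren; i++)
        sketch_set_node_level(node->children[i], jit_above_cost);
    node->jitFlags = sketch_flags_for_cost(node->total_cost, jit_above_cost);
}

/* An expression consults its parent plan node; no new argument needed. */
static bool
sketch_should_jit_expr(const SketchExprState *state)
{
    if (state->parent == NULL || state->parent->plan == NULL)
        return false;           /* expression not attached to a plan node */
    return (state->parent->plan->jitFlags & SKETCH_JIT_PERFORM) != 0;
}
```

Fed the costs from the EXPLAIN above with a threshold of 10000, the
per-node variant flags listp1 (16925.00) and leaves listp2 (38.25)
alone, whereas the plan-level variant flags both. The Append node is
over the threshold in either case, since its total cost includes its
children.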

David
