Making JIT more granular

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Making JIT more granular
Date: 2020-08-04 02:01:03
Message-ID: CAApHDvpQJqLrNOSi8P1JLM8YE2C+ksKFpSdZg=q6sTbtQ-v=aw@mail.gmail.com
Lists: pgsql-hackers

Hi,

At the moment JIT compilation, if enabled, is applied to all
expressions in the entire plan. This can sometimes be a problem as
some expressions may be evaluated many times and warrant being JITted,
but others may be evaluated only a few times, or even not at all.

This problem tends to get worse when table partitioning is involved,
as the number of expressions in the plan grows with each partition
present in the plan. Some partitions may have many rows and it can be
useful to JIT their expressions, but others may have few rows or even
no rows, in which case JIT is a waste of effort.

I recall a few cases where people have complained that JIT was too
slow. One case, in particular, is [1].

It would be nice if JIT was more granular about which parts of the
plan it could be enabled for. So I went and did that in the attached.

The patch basically changes the plan-level consideration of whether
JIT should be enabled, and to what level, into a per-plan-node
consideration. So, instead of considering JIT based on the overall
total_cost of the plan, we just consider it on the plan node's
total_cost.
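
To illustrate the difference between the two approaches, here is a
minimal Python sketch. It is not the patch's code: the PlanNode class
and the helper names are hypothetical, and the node costs are made up.
The three thresholds are the stock defaults of jit_above_cost,
jit_inline_above_cost and jit_optimize_above_cost.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Stock defaults of the corresponding GUCs.
JIT_ABOVE_COST = 100_000.0
JIT_INLINE_ABOVE_COST = 500_000.0
JIT_OPTIMIZE_ABOVE_COST = 500_000.0


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; only what the sketch needs."""
    name: str
    total_cost: float
    children: List["PlanNode"] = field(default_factory=list)


def jit_flags(total_cost: float) -> Set[str]:
    """Return the JIT steps a given total_cost qualifies for."""
    flags: Set[str] = set()
    if total_cost > JIT_ABOVE_COST:
        flags.add("perform")
        if total_cost > JIT_INLINE_ABOVE_COST:
            flags.add("inline")
        if total_cost > JIT_OPTIMIZE_ABOVE_COST:
            flags.add("optimize")
    return flags


def walk(node: PlanNode):
    """Yield every node in the plan tree, parents first."""
    yield node
    for child in node.children:
        yield from walk(child)


def plan_level(root: PlanNode) -> dict:
    """Current behaviour: one decision from the top node applies everywhere."""
    flags = jit_flags(root.total_cost)
    return {n.name: flags for n in walk(root)}


def node_level(root: PlanNode) -> dict:
    """Proposed behaviour: each node decides from its own total_cost."""
    return {n.name: jit_flags(n.total_cost) for n in walk(root)}


# Made-up costs: one large partition and two empty ones under an Append.
plan = PlanNode("Aggregate", 1_950_000.0, [
    PlanNode("Append", 1_700_000.0, [
        PlanNode("Seq Scan on listp1", 1_690_000.0),
        PlanNode("Seq Scan on listp2", 35.5),
        PlanNode("Seq Scan on listp3", 35.5),
    ]),
])

for name, flags in node_level(plan).items():
    print(f"{name}: {sorted(flags) or 'no JIT'}")
```

Because a node's total_cost includes the cost of its children, the
upper nodes still qualify in the per-node scheme; only the cheap scans
on the empty partitions drop out.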

I was just playing around with a test case of:

create table listp(a int, b int) partition by list(a);
select 'create table listp'|| x || ' partition of listp for values in('||x||');' from generate_series(1,1000) x;
\gexec
insert into listp select 1,x from generate_series(1,100000000) x;
vacuum analyze listp;

explain (analyze, buffers) select count(*) from listp where b < 0;

I get:

master jit=on
JIT:
Functions: 3002
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 141.587 ms, Inlining 11.760 ms, Optimization 6518.664 ms, Emission 3152.266 ms, Total 9824.277 ms
Execution Time: 12588.292 ms
(2013 rows)

master jit=off
Execution Time: 3672.391 ms

patched jit=on
JIT:
Functions: 5
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.675 ms, Inlining 3.322 ms, Optimization 10.766 ms, Emission 5.892 ms, Total 20.655 ms
Execution Time: 2754.160 ms
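
As a rough cross-check of these numbers, subtracting the reported JIT
total from the execution time gives a figure for the non-JIT work. The
sketch below assumes the JIT Total is fully contained in Execution
Time; the function and variable names are only illustrative.

```python
# Figures copied from the EXPLAIN ANALYZE output above, in milliseconds.
runs = {
    "master jit=on": {"execution": 12588.292, "jit_total": 9824.277},
    "master jit=off": {"execution": 3672.391, "jit_total": 0.0},
    "patched jit=on": {"execution": 2754.160, "jit_total": 20.655},
}


def time_outside_jit(run: dict) -> float:
    """Execution time left after removing the JIT overhead (3 decimals)."""
    return round(run["execution"] - run["jit_total"], 3)


def jit_share(run: dict) -> float:
    """Fraction of the execution time spent on JIT compilation."""
    return run["jit_total"] / run["execution"]


for label, run in runs.items():
    print(f"{label}: {time_outside_jit(run):.3f} ms outside JIT, "
          f"{jit_share(run):.1%} spent in JIT")
```

Under that assumption, master with jit=on spends roughly 78% of its
run time on JIT compilation, and the remaining 2764.015 ms is close to
the patched run's total.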

This EXPLAIN format will need further work as each of those flags is
now per plan node rather than on the plan as a whole. I considered
just making the true/false a counter to count the number of functions,
e.g. Inlined: 5, Optimized: 5, etc.
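
One way the counter idea could look, as a sketch only: the flag names
and the output layout below are assumptions, not what the patch
prints.

```python
from collections import Counter
from typing import Iterable, Set

# Hypothetical per-function flag names, in the order EXPLAIN lists them today.
FLAG_LABELS = [
    ("inline", "Inlined"),
    ("optimize", "Optimized"),
    ("expressions", "Expressions"),
    ("deform", "Deformed"),
]


def format_jit_counts(per_function_flags: Iterable[Set[str]]) -> str:
    """Summarise per-function JIT flags as 'Label: count' pairs."""
    flag_sets = list(per_function_flags)
    counts = Counter(flag for flags in flag_sets for flag in flags)
    parts = [f"Functions: {len(flag_sets)}"]
    parts += [f"{label}: {counts[key]}" for key, label in FLAG_LABELS]
    return ", ".join(parts)


# Five functions with every option on (an assumed breakdown of the
# patched run's "Functions: 5").
all_on = {"inline", "optimize", "expressions", "deform"}
print(format_jit_counts([all_on] * 5))
```

A mixed plan would then show different counts per flag, which the
current true/false display cannot express.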

I understand from [2] that Andres has WIP code to improve the
performance of JIT compilation. That's really great, but I also
believe that no matter how fast we make it, it's going to be a waste
of effort unless the expressions are evaluated enough times for the
cheaper evaluations to pay off the compilation costs. It'll never be a
win when we evaluate certain expressions zero times. What Andres has
should allow us to lower the default jit costs.
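
The break-even argument can be made concrete with a small sketch. All
numbers here are invented for illustration, and the function names are
hypothetical; only the shape of the calculation matters.

```python
import math


def break_even_evaluations(compile_ms: float,
                           interpreted_us: float,
                           jitted_us: float) -> float:
    """Evaluations needed before JIT compilation pays for itself.

    compile_ms      one-off cost of compiling the expression
    interpreted_us  cost of one interpreted evaluation
    jitted_us       cost of one JIT-compiled evaluation
    Returns math.inf when the compiled code is no cheaper per evaluation.
    """
    saving_us = interpreted_us - jitted_us
    if saving_us <= 0:
        return math.inf
    return (compile_ms * 1000.0) / saving_us


def jit_is_a_win(evaluations: int, compile_ms: float,
                 interpreted_us: float, jitted_us: float) -> bool:
    """True when total time with JIT beats total time without it."""
    without_jit = evaluations * interpreted_us
    with_jit = compile_ms * 1000.0 + evaluations * jitted_us
    return with_jit < without_jit


# Invented figures: 3 ms to compile, 0.10 us interpreted, 0.04 us compiled.
print(break_even_evaluations(3.0, 0.10, 0.04))
print(jit_is_a_win(0, 3.0, 0.10, 0.04))
print(jit_is_a_win(100_000_000, 3.0, 0.10, 0.04))
```

Shrinking compile_ms lowers the break-even point, but with zero
evaluations jit_is_a_win() stays false for any positive compile cost.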

Happy to hear people's thoughts on this.

David

[1] https://www.postgresql.org/message-id/7736C40E-6DB5-4E7A-8FE3-4B2AB8E22793@elevated-dev.com
[2] https://www.postgresql.org/message-id/20200728212806.tu5ebmdbmfrvhoao@alap3.anarazel.de

Attachment: granular_jit_v1.patch (application/octet-stream, 12.2 KB)
