Re: Lazy JIT IR code generation to increase JIT speed with partitions

From: Luc Vlaming Hummel <luc(dot)vlaming(at)servicenow(dot)com>
To: David Geier <geidav(dot)pg(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lazy JIT IR code generation to increase JIT speed with partitions
Date: 2022-07-04 06:43:00
Message-ID: CO3PR08MB7990B47B21621BAEB583D63D9CBE9@CO3PR08MB7990.namprd08.prod.outlook.com
Lists: pgsql-hackers

Hi Alvaro, hi David,

Thanks for reviewing this and the interesting examples!

I wanted to give some extra insight into why I'd love to have a system that can lazily emit JIT code and hence creates roughly one module per function:
In the end I'm hoping we can migrate to a system where we only JIT a node after a configurable cost has been exceeded for it, and a configurable number of rows has actually been processed.
The reason is that this would safeguard against some problematic planning issues wrt JIT (a node not being executed, a row count estimate that is massively off).
It would also allow more fine-grained control, with a cost system similar to most other planning costs, which are also per node rather than global, and would potentially let us JIT only those things where we expect the runtime gain to truly outweigh the cost of compiling.

If this means we have to invest more in making it cheap(er) to emit modules, I'm all for that. Kudos to David for fixing the caching in that sense :) @Andres, if there are any other things we ought to fix to make this cheap enough compared to the previous code, I'd love to hear your thoughts.

Best,
Luc Vlaming
(ServiceNow)

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
Sent: Wednesday, June 29, 2022 11:03 AM
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Luc Vlaming <luc(at)swarm64(dot)com>; Andres Freund <andres(at)anarazel(dot)de>; PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lazy JIT IR code generation to increase JIT speed with partitions
 
Hi Alvaro,

That's a very interesting case and might indeed be fixed or at least improved by this patch. I tried to reproduce this, but at least when running a simple, serial query with increasing numbers of functions, the time spent per function is linear or even slightly sub-linear (same as Tom observed in [1]).

I also couldn't reproduce the JIT runtimes you shared when running the attached catalog query. For me it ran serially, with the following JIT stats:

 JIT:
   Functions: 169
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 12.223 ms, Inlining 17.323 ms, Optimization 388.491 ms, Emission 283.464 ms, Total 701.501 ms

Is it possible that the query ran in parallel for you? For parallel queries, every worker JITs all of the functions it uses. Even though the workers might JIT the functions in parallel, the time reported in the EXPLAIN ANALYZE output is the sum of the time spent by all workers. With this patch applied, the JIT time drops significantly, as many of the generated functions remain unused.

 JIT:
   Modules: 15
   Functions: 26
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.931 ms, Inlining 0.722 ms, Optimization 67.195 ms, Emission 70.347 ms, Total 140.195 ms

Of course, this does not prove that the nonlinearity that you observed went away. Could you share with me how you ran the query so that I can reproduce your numbers on master to then compare them with the patched version? Also, which LLVM version did you run with? I'm currently running with LLVM 13.
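In case it helps for reproducing this, one quick way to rule the parallelism hypothesis in or out (using the standard GUC, not anything specific to this patch) is to rerun with parallel workers disabled and compare the reported JIT timings:

```sql
-- Force serial execution so per-worker JIT work (and its reported
-- timing) is not multiplied across processes.
SET max_parallel_workers_per_gather = 0;
EXPLAIN (ANALYZE) ...;  -- the attached catalog query
```

If the JIT timings shrink dramatically in serial mode, the original numbers were the summed per-worker compilation times rather than a nonlinearity in code generation itself.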

Thanks!

--
David Geier
(ServiceNow)

On Mon, Jun 27, 2022 at 5:37 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
On 2021-Jan-18, Luc Vlaming wrote:

> I would like this topic to somehow progress and was wondering what other
> benchmarks / tests would be needed to have some progress? I've so far
> provided benchmarks for small(ish) queries and some tpch numbers, assuming
> those would be enough.

Hi, some time ago I reported a case[1] where our JIT implementation does
a very poor job and perhaps the changes that you're making could explain
what is going on, and maybe even fix it:

[1] https://postgr.es/m/202111141706.wqq7xoyigwa2@alvherre.pgsql

The query for which I investigated the problem involved some pg_logical
metadata tables, so I didn't post it anywhere public; but the blog post
I found later contains a link to a query that shows the same symptoms,
and which is luckily still available online:
https://gist.github.com/saicitus/251ba20b211e9e73285af35e61b19580
I attach it here in case it goes missing sometime.

--
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
