Re: JIT compiling with LLVM v9.0

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: JIT compiling with LLVM v9.0
Date: 2018-01-27 02:40:42
Message-ID: 20180127024042.kk24od3376gng57h@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-01-26 18:26:03 -0800, Jeff Davis wrote:
> On Wed, Jan 24, 2018 at 11:02 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Not entirely sure what you mean. You mean why I don't inline
> > slot_getsomeattrs() etc and instead generate code manually? The reason
> > is that the generated code is a *lot* smarter due to knowing the
> > specific tupledesc.
>
> I would like to see if we can get a combination of JIT and LTO to work
> together to specialize generic code at runtime.

Well, LTO can't quite work. It relies on being able to mark code in
modules linked together as externally visible - and clearly we can't do
that for a running postgres binary. At least in all incarnations I'm
aware of. But that's why the tree I posted supports inlining of code.

> Let's say you have a function f(int x, int y, int z). You want to be
> able to specialize it on y at runtime, so that a loop gets unrolled in
> the common case where y is small.
>
> 1. At build time, create bitcode for the generic implementation of f().
> 2. At run time, load the generic bitcode into a module (let's call it
> the "generic module")
> 3. At run time, create a new module (let's call it the "bind module")
> that only does the following things:
> a. declares a global variable bind_y, and initialize it to the value 3
> b. declares a wrapper function f_wrapper(int x, int z), and all the
> function does is call f(x, bind_y, z)
> 4. Link the generic module and the bind module together (let's call
> the result the "linked module")
> 5. Optimize the linked module

Afaict that's effectively what I've already implemented. We could export
more input as constants to the generated program, but other than that...
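
To make the quoted steps concrete, here is a minimal plain-C analogue rather than actual LLVM API usage. The body of f() and the value 3 for bind_y are illustrative assumptions. In the real scheme the two pieces would be separate LLVM modules, linked and optimized at run time; here an ordinary optimizing compiler simply sees both in one translation unit.

```c
#include <assert.h>

/*
 * "Generic module": stands in for the bitcode built in step 1.
 * The body is made up for illustration: add x to z, y times.
 */
int f(int x, int y, int z)
{
    int acc = z;

    assert(y >= 0);
    for (int i = 0; i < y; i++)
        acc += x;
    return acc;
}

/*
 * "Bind module" (step 3): a bound constant plus a thin wrapper.
 * The value 3 is the example value from the quoted text.
 */
static const int bind_y = 3;

int f_wrapper(int x, int z)
{
    /* All the wrapper does is forward to f() with y bound. */
    return f(x, bind_y, z);
}
```

With bind_y constant and f() visible, an optimizer is free to inline f() into f_wrapper() and unroll the three-iteration loop, so f_wrapper(2, 5) computes 5 + 2 + 2 + 2.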

Whenever any extern functions are referenced, and jit_inlining=1, then
the code will see whether the called external code is available as JIT
bitcode. Based on a simple instruction-based cost limit, that function
will get inlined (unless it references file-local non-constant static
variables and such).
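
The inlining decision described above can be sketched roughly as follows. This is not the code from the posted tree: FuncSummary, should_inline() and the cost_limit parameter are hypothetical names chosen for illustration, and only the conditions mentioned in the text are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one external function; not a PostgreSQL struct. */
typedef struct FuncSummary
{
    bool   has_bitcode;              /* is a bitcode body available? */
    size_t instruction_count;        /* size of that body, in instructions */
    bool   uses_mutable_file_static; /* touches a file-local non-constant static? */
} FuncSummary;

/* Decide whether a referenced external function may be inlined. */
bool should_inline(const FuncSummary *fn, bool jit_inlining, size_t cost_limit)
{
    assert(fn != NULL);

    /* Inlining must be enabled and there must be bitcode to inline. */
    if (!jit_inlining || !fn->has_bitcode)
        return false;

    /* Functions referencing file-local mutable statics are skipped. */
    if (fn->uses_mutable_file_static)
        return false;

    /* Simple instruction-based cost limit. */
    return fn->instruction_count <= cost_limit;
}
```

A small function with bitcode available would pass the check, while the same function would be rejected if inlining is disabled, if it exceeds the limit, or if it touches a file-local mutable static.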

Now the JITed expression tree currently makes it hard for LLVM to
recognize some constant input as constant, but what's largely needed for
that to be better are some improvements in where temporary values are
stored (should be in allocas rather than local memory, so mem2reg can
do its thing). It's a TODO... Right now LLVM will figure out constant
inputs to non-strict functions, but not to strict ones; after fixing
some of what I've mentioned above it works pretty universally.
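
As a loose C analogy for the strict-function case (a strict function yields NULL as soon as any argument is NULL): ArgBlock and both functions below are hypothetical and are not PostgreSQL's calling convention. The first version reads its arguments through a pointer into memory filled elsewhere, so an optimizer has to prove what was stored there. The second keeps the constant operand in locals whose address never escapes, which is the kind of value mem2reg can promote to a register and fold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical argument block living in memory owned by the caller. */
typedef struct ArgBlock
{
    long value[2];
    bool isnull[2];
} ArgBlock;

/* Strict addition, arguments read through memory. */
long add_strict_mem(const ArgBlock *args, bool *result_isnull)
{
    assert(args != NULL && result_isnull != NULL);

    /* Strictness check: any NULL input makes the result NULL. */
    if (args->isnull[0] || args->isnull[1])
    {
        *result_isnull = true;
        return 0;
    }
    *result_isnull = false;
    return args->value[0] + args->value[1];
}

/* Same operation, "x + 1", with the constant operand held in locals. */
long add_one_locals(long x, bool x_isnull, bool *result_isnull)
{
    const long one = 1;            /* constant input */
    const bool one_isnull = false; /* known non-NULL, check can fold away */

    assert(result_isnull != NULL);
    if (x_isnull || one_isnull)
    {
        *result_isnull = true;
        return 0;
    }
    *result_isnull = false;
    return x + one;
}
```

Both versions compute the same result; the difference is only in how easily an optimizer can see that the second operand and its null flag are constants.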

Have I misunderstood and there's some significant functional difference?

> I experimented a bit before and it works for basic cases, but I'm not
> sure if it's as good as your hand-generated LLVM.

For deforming it doesn't even remotely get as good in my experiments.

> If we can make this work, it would be a big win for
> readability/maintainability. The hand-generated LLVM is limited to the
> bind module, which is very simple, and doesn't need to be changed when
> the implementation of f() changes.

Right. That's why I think we definitely want that for the large majority
of referenced functionality.

Greetings,

Andres Freund
