Re: JIT compiling with LLVM v9.0

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: JIT compiling with LLVM v9.0
Date: 2018-01-31 01:42:26
Message-ID: CAEepm=3t+taENH6Z6VgkuY55b5PaH9Drnfs7Oyg+1nh4tT6M8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 31, 2018 at 11:57 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2018-01-30 13:46:37 -0500, Robert Haas wrote:
>> On Mon, Jan 29, 2018 at 1:40 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > It's an optional dependency, and it doesn't increase build time that
>> > much... If we were to move the llvm interfacing code to a .so, there'd
>> > not even be a packaging issue, you can just package that .so separately
>> > and get errors if somebody tries to enable LLVM without that .so being
>> > installed.
>>
>> I suspect that would be really valuable. If 'yum install
>> postgresql-server' (or your favorite equivalent) sucks down all of
>> LLVM, some people are going to complain, either because they are
>> trying to build little tiny machine images or because they are subject
>> to policies which preclude the presence of a compiler on a production
>> server. If you can do 'yum install postgresql-server' without
>> additional dependencies and 'yum install postgresql-server-jit' to
>> make it go faster, that issue is solved.
>
> So, I'm working on that now. In the course of this I'll be
> painfully rebase and rename a lot of code, which I'd like not to repeat
> unnecessarily.
>
> Right now there primarily is:
>
> src/backend/lib/llvmjit.c - infrastructure, optimization, error handling
> src/backend/lib/llvmjit_{error,wrap,inline}.cpp - expose more stuff to C
> src/backend/executor/execExprCompile.c - emit LLVM IR for expressions
> src/backend/access/common/heaptuple.c - emit LLVM IR for deforming
>
> Given that we need a shared library it'll be best buildsystem wise if
> all of this is in a directory, and there's a separate file containing
> the stubs that call into it.
>
> I'm not quite sure where to put the code. I'm a bit inclined to add a
> new
> src/backend/jit/
> because we're dealing with code from across different categories? There
> we could have a pgjit.c with the stubs, and llvmjit/ with the llvm
> specific code?
>
> Alternatively I'd say we put the stub into src/backend/executor/pgjit.c,
> and the actual llvm using code into src/backend/executor/llvmjit/?
>
> Comments?

I'm just starting to look at this (amazing) work, and I don't have a
strong opinion yet. But certainly, making it easy for packagers to
put the -jit stuff into a separate package for the reasons already
given sounds sensible to me. Some systems package LLVM as one
gigantic package that'll get you 1GB of compiler/debugger/other stuff
and perhaps violate local rules by installing a compiler when you
really just wanted libLLVM{whatever}.so. I guess it should be made
very clear to users (explain plans, maybe startup message, ...?)
whether JIT support is active/installed so that people are at least
very aware when they encounter a system that is interpreting stuff it
could be compiling. Putting all the JIT into a separate directory
under src/backend/jit certainly looks sensible at first glance, but
I'm not sure.

Incidentally, from commit fdc6c7a6dddbd6df63717f2375637660bcd00fc6
(HEAD -> jit, andresfreund/jit) on your branch I get:

ccache c++ -Wall -Wpointer-arith -fno-strict-aliasing -fwrapv -g -g
-O2 -fno-exceptions -I../../../src/include
-I/usr/local/llvm50/include -DLLVM_BUILD_GLOBAL_ISEL
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-I/usr/local/include -c -o llvmjit_error.o llvmjit_error.cpp -MMD -MP
-MF .deps/llvmjit_error.Po
In file included from llvmjit_error.cpp:26:
In file included from ../../../src/include/lib/llvmjit.h:48:
In file included from /usr/local/llvm50/include/llvm-c/Types.h:17:
In file included from /usr/local/llvm50/include/llvm/Support/DataTypes.h:33:
/usr/include/c++/v1/cmath:555:1: error: templates must have C++ linkage
template <class _A1>
^~~~~~~~~~~~~~~~~~~~
llvmjit_error.cpp:24:1: note: extern "C" language linkage
specification begins here
extern "C"
^

$ c++ -v
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on
LLVM 4.0.0)

This seems to be a valid complaint. I don't think you should be
(indirectly) wrapping Types.h in extern "C". At a guess, your
llvmjit.h should be doing its own #ifdef __cplusplus'd linkage
specifiers, so you can use it from C or C++, but making sure that you
don't #include LLVM's headers from a bizarro context where __cplusplus
is defined but the linkage is unexpectedly already "C"?

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2018-01-31 01:43:28 Re: [HACKERS] GUC for cleanup indexes threshold.
Previous Message Masahiko Sawada 2018-01-31 01:38:46 Re: User defined data types in Logical Replication