Re: JIT compiling with LLVM v12.2

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: JIT compiling with LLVM v12.2
Date: 2018-03-20 23:07:59
Message-ID: CAEepm=1OYvsASBWULP-9iypaXjRYPiKHg+7uueH4WfmKcX8JoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 20, 2018 at 11:14 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> - doc proofreading, addition of --with-llvm docs

The documentation builds and the resulting HTML looks good, and I like
what you've written for users and also for developers in the README
file. Perhaps it could use something about how to know it's working
with EXPLAIN (or any other introspection there might be), but maybe
you're still working on that?

I did a proof-reading pass and have some minor language and
typesetting suggestions. See comments below and attached patch
(against current HEAD of your jit branch) which implements all of
these changes, which of course you can feel free to take individual
hunks from or ignore if you disagree!

+ <varlistentry>
+ <term><acronym>JIT</acronym></term>
+ <listitem>
+ <para>
+ <ulink url="https://en.wikipedia.org/wiki/Just-in-time_compilation">Just
in Time
+ Compilation</ulink>
+ </para>
+ </listitem>
+ </varlistentry>

The usual typesetting seems to be "just-in-time" (with hyphens),
including on Wikipedia, various literature and in dictionaries. Here
"compilation" doesn't seem to need a capital letter (it's not part of
the acronym, it's not otherwise in a title context where
capitalisation is called for).

Similar comments apply to all other mentions, also changed (though I
won't repeat them here; see patch).

+ <title>JIT accelerated operations</title>

Although the existing documentation is not entirely consistent on this
point, it almost always follows the US convention for titles: initial
capitals except for a handful of small words ("is", "of", "to", ...),
like the New York Times and unlike the (London) Times. So here "JIT
Accelerated Operations". Same change elsewhere.

+ <title>What is <acronym>JIT</acronym></title>

Needs a question mark.

+ As explained in <xref linkend="jit-decision"/> the configuration variables
+ xref <xref linkend="guc-jit-above-cost"/>, <xref

Extra "xref" in the text.

+ <varlistentry id="guc-jit-above-cost" xreflabel="guc-jit-above-cost">

xreflabel should use underscores not hyphens, and shouldn't have the
leading "guc" (this breaks the resulting HTML).

+ Sets the planner's cutoff after which JIT compilation is used as part
...
+ Sets the planner's cutoff after which JIT compiled programs (see <xref

s/after which/above which/. I see there was some nearby text that
used "after which", but that was talking about time.

I think writers might do s/JIT compiled/JIT-compiled/ here and some
similar places (JIT-generated, JIT-accelerated etc), though I'm not
sure about that and I doubt anyone cares so I didn't change it.

+ available, but no error will be raised. This allows to install JIT
+ support separately from the main <productname>PostgreSQL</productname>
+ package.

"allows to ..." isn't correct English. You can say things like
"allows <object> to <infinitive>", and "allows installation ...", and
maybe "to allow installing ..." (though the last sounds a bit clumsy).
I rewrote it as "This allows JIT support to be installed separately
from ...". Same sort of thing in several places.

+ Writes the generated <productname>LLVM</productname> IR out to the
+ filesystem, inside <xref linkend="guc-data-directory"/>. This is only
+ useful for development of JIT.

Seems a little vague... maybe "... for working on the internals of the
JIT implementation"? Just to make clear it's not for end users unless
curious.

+ E.g. instead of using a facility that can evaluate arbitrary arbitrary SQL
+ expressions to evaluate an SQL predicate like <literal>WHERE a.col =

I'd write "For example" here. "E.g." seems more appropriate for
fitting into examples into tight spaces, like a remark in parentheses.
YMMV.

"arbitrary" is repeated.

+ Expression evaluation is used to evaluate <literal>WHERE</literal>
+ clauses, target lists, aggregates and projections. It can be accelerated
+ by generating code specific to the used expression.

How about "... by generating code specific to each case."

+ Tuple deforming is the process of transforming an on-disk tuple (see <xref
+ linkend="heaptuple"/>) into its in-memory representation. It can be
+ accelerated by creating a function specific to the table layout and the
+ number of to be extracted columns.

"... number of columns to be extracted."

+ <productname>LLVM</productname> has support for optimizing generated
+ code. Some of the optimizations are cheap enough to be performed whenever
+ <acronym>JIT</acronym> is used, others are only beneficial for more longer
+ running queries.

", while others are only beneficial for longer running queries."

+ <productname>PostgreSQL</productname> is very extensible and allows to
+ extend the set of datatypes, functions, operators, etc.; see <xref
+ linkend="extend"/>. In fact the builtin ones are implemented using nearly
+ the same mechanisms. This extensibility implies some overhead, e.g. due
+ to function calls (see <xref linkend="xfunc"/>). To reduce that overhead
+ <acronym>JIT</acronym> compilation can inline the body for small functions
+ into the expression using them. That allows to optimize away a significant
+ percentage of the overhead.

"... and allows new datatypes, functions, operators and other database
objects to be defined; ..."

"... can inline the body *of* small functions ..."

"... That allows a significant percentage of the overhead to be optimized away."

+ <acronym>JIT</acronym> is beneficial primarily for long-running, CPU bound,
+ queries. Frequently these will be analytical queries. For short queries

I'd lose those commas.

+ made. Firstly, if the query is more costly than the <xref
+ linkend="guc-jit-optimize-above-cost"/> GUC expensive optimizations are

I'd add a comma (and maybe "then") before "GUC".

+ For development and debugging purposes a few additional GUCs exist. <xref
+ linkend="guc-jit-dump-bitcode"/> allows to inspect the generated
+ bitcode. <xref linkend="guc-jit-debugging-support"/> allows GDB to see
+ generated functions. <xref linkend="guc-jit-profiling-support"/> emits
+ information so the <productname>perf</productname> profiler can interpret
+ JIT generated functions sensibly.

"... allows the generated bitcode to be inspected."

+ <programlisting>
+struct JitProviderCallbacks
+{
+ JitProviderResetAfterErrorCB reset_after_error;
+ JitProviderReleaseContextCB release_context;
+ JitProviderCompileExprCB compile_expr;
+};
+extern void _PG_jit_provider_init(JitProviderCallbacks *cb);
+ </programlisting>

Some weird tabs in here. Changed to spaces.

About the README, some of this text is similar to the user-facing docs
and the same comments apply, and there are also some nitpicks about
apostrophes etc that I won't bother to repeat here.

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
jit-doc-tweaks.patch application/octet-stream 14.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-03-20 23:18:29 Re: JIT compiling with LLVM v12.2
Previous Message Stephen Frost 2018-03-20 21:44:22 Re: PATCH: Configurable file mode mask