Re: effect of JIT tuple deform?

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: effect of JIT tuple deform?
Date: 2018-06-25 03:52:48
Message-ID: CAFj8pRAjYx5N+DAc+uKU0WocypmF0UGeP+qfUOXxBS0D1TV0Pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2018-06-25 5:41 GMT+02:00 Andres Freund <andres(at)anarazel(dot)de>:

> On 2018-06-25 05:32:37 +0200, Pavel Stehule wrote:
> > 2018-06-24 22:32 GMT+02:00 Dmitry Dolgov <9erthalion6(at)gmail(dot)com>:
> >
> > > > On 23 June 2018 at 08:47, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> > > >
> > > >
> > > > 2018-06-23 8:35 GMT+02:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> > > >>
> > > >> Hi
> > > >>
> > > >> I try to measure effect of JIT tuple deform and I don't see any
> possible
> > > >> effect.
> > > >>
> > > >> Is it this feature active in master branch?
> > > >>
> > > >> Is possible to see this feature in EXPLAIN ANALYZE?
> > > >
> > > >
> > > > Unfortunately I got slowdown
> > > >
> > > > 0. shared buffers = 1GB
> > > > 1. create table with 50 int columns
> > > > 2. insert into this table 4M rows
> > >
> > > Hi,
> > >
> > > Looks like I can reproduce the situation you're talking about (with
> some
> > > minor
> > > differences)
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > ------------------------------------------------------------
> > > -------------------
> > > Finalize Aggregate
> > > (cost=127097.71..127097.72 rows=1 width=8)
> > > (actual time=813.957..813.957 rows=1 loops=1)
> > > -> Gather
> > > (cost=127097.50..127097.71 rows=2 width=8)
> > > (actual time=813.946..813.950 rows=3 loops=1)
> > > Workers Planned: 2
> > > Workers Launched: 2
> > > -> Partial Aggregate
> > > (cost=126097.50..126097.51 rows=1 width=8)
> > > (actual time=802.585..802.585 rows=1 loops=3)
> > > -> Parallel Seq Scan on test_deform
> > > (cost=0.00..121930.80 rows=1666680 width=4)
> > > (actual time=0.040..207.326 rows=1333333 loops=3)
> > > Planning Time: 0.212 ms
> > > JIT:
> > > Functions: 6
> > > Generation Time: 3.076 ms
> > > Inlining: false
> > > Inlining Time: 0.000 ms
> > > Optimization: false
> > > Optimization Time: 1.328 ms
> > > Emission Time: 8.601 ms
> > > Execution Time: 820.127 ms
> > > (16 rows)
> > >
> > > =# set jit_tuple_deforming to off;
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > ------------------------------------------------------------
> > > -------------------
> > > Finalize Aggregate
> > > (cost=127097.71..127097.72 rows=1 width=8)
> > > (actual time=742.578..742.578 rows=1 loops=1)
> > > -> Gather
> > > (cost=127097.50..127097.71 rows=2 width=8)
> > > (actual time=742.529..742.569 rows=3 loops=1)
> > > Workers Planned: 2
> > > Workers Launched: 2
> > > -> Partial Aggregate
> > > (cost=126097.50..126097.51 rows=1 width=8)
> > > (actual time=727.876..727.876 rows=1 loops=3)
> > > -> Parallel Seq Scan on test_deform
> > > (cost=0.00..121930.80 rows=1666680 width=4)
> > > (actual time=0.044..204.097 rows=1333333 loops=3)
> > > Planning Time: 0.361 ms
> > > JIT:
> > > Functions: 4
> > > Generation Time: 2.840 ms
> > > Inlining: false
> > > Inlining Time: 0.000 ms
> > > Optimization: false
> > > Optimization Time: 0.751 ms
> > > Emission Time: 6.436 ms
> > > Execution Time: 749.827 ms
> > > (16 rows)
> > >
> > > But at the same time on the bigger dataset (20M rows instead of 4M) the
> > > very
> > > same query gets better:
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > ------------------------------------------------------------
> > > -------------------
> > > Finalize Aggregate
> > > (cost=631482.92..631482.93 rows=1 width=8)
> > > (actual time=2350.288..2350.288 rows=1 loops=1)
> > > -> Gather
> > > (cost=631482.71..631482.92 rows=2 width=8)
> > > (actual time=2350.276..2350.279 rows=3 loops=1)
> > > Workers Planned: 2
> > > Workers Launched: 2
> > > -> Partial Aggregate
> > > (cost=630482.71..630482.72 rows=1 width=8)
> > > (actual time=2328.378..2328.379 rows=1 loops=3)
> > > -> Parallel Seq Scan on test_deform
> > > (cost=0.00..609649.37 rows=8333337 width=4)
> > > (actual time=0.027..1175.960 rows=6666667 loops=3)
> > > Planning Time: 0.600 ms
> > > JIT:
> > > Functions: 6
> > > Generation Time: 3.661 ms
> > > Inlining: true
> > > Inlining Time: 65.373 ms
> > > Optimization: true
> > > Optimization Time: 120.885 ms
> > > Emission Time: 58.836 ms
> > > Execution Time: 2543.280 ms
> > > (16 rows)
> > >
> > > =# set jit_tuple_deforming to off;
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > ------------------------------------------------------------
> > > -------------------
> > > Finalize Aggregate
> > > (cost=631482.92..631482.93 rows=1 width=8)
> > > (actual time=3616.977..3616.977 rows=1 loops=1)
> > > -> Gather
> > > (cost=631482.71..631482.92 rows=2 width=8)
> > > (actual time=3616.959..3616.971 rows=3 loops=1)
> > > Workers Planned: 2
> > > Workers Launched: 2
> > > -> Partial Aggregate
> > > (cost=630482.71..630482.72 rows=1 width=8)
> > > (actual time=3593.220..3593.220 rows=1 loops=3)
> > > -> Parallel Seq Scan on test_deform
> > > (cost=0.00..609649.37 rows=8333337 width=4)
> > > (actual time=0.049..1027.353 rows=6666667 loops=3)
> > > Planning Time: 0.149 ms
> > > JIT:
> > > Functions: 4
> > > Generation Time: 1.803 ms
> > > Inlining: true
> > > Inlining Time: 23.529 ms
> > > Optimization: true
> > > Optimization Time: 18.843 ms
> > > Emission Time: 9.307 ms
> > > Execution Time: 3625.674 ms
> > > (16 rows)
> > >
> > > `perf diff` indeed shows that in the first case (with the 4M rows
> dataset)
> > > the
> > > jitted version has some noticeable delta for one call, and
> unfortunately
> > > so far
> > > I couldn't figure out which one exactly because of JIT (btw, who can
> > > explain
> > > how to see a correct full `perf report` in this case? Somehow `perf
> > > inject --jit -o
> > > perf.data.jitted` and jit_profiling_support didn't help).
> > >
> > > But since on the bigger dataset I've got expected results, maybe it's
> just
> > > a
> > > sign that JIT kicks in too early in this case and what's necessary is
> to
> > > adjust
> > > jit_above_cost/jit_optimize_above_cost/jit_inline_above_cost?
> > >
> >
> > maybe llvm does real compilation too late. It is too strange, because I
> > though about JIT cost like initial (fixed) costs. Now, it looks so this
> > cost is related to row numbers, and then the situation is much more
> complex.
>
> I don't understand what you're trying to say here. Compilation happens
> on the first execution of the function. It's a fixed cost. The point
> with more rows showing a benefit is that the benefit of JITing is bigger
> in total if the function is going to be executed more. And thus at some
> point the execution benefit is bigger than the cost of doing JITing.
>

Maybe I am little bit messy because I see slowdown about 60ms, but I don't
see this value in JIT statistics

Regards

Pavel

> Greetings,
>
> Andres Freund
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-06-25 05:00:36 Generating partitioning tuple conversion maps faster
Previous Message Andres Freund 2018-06-25 03:41:41 Re: effect of JIT tuple deform?