Re: Stampede of the JIT compilers

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Pirotte <dpirotte(at)gmail(dot)com>
Subject: Re: Stampede of the JIT compilers
Date: 2023-06-24 17:12:29
Message-ID: CAAaqYe-mU-KepwZma5cf=8-uW+LkNOqiXYzozQEjdi-1_o+phw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 24, 2023 at 7:40 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
>
>
> On 6/24/23 02:33, David Rowley wrote:
> > On Sat, 24 Jun 2023 at 02:28, James Coleman <jtc331(at)gmail(dot)com> wrote:
> >> There are a couple of issues here. I'm sure it's been discussed
> >> before, and it's not the point of my thread, but I can't help but note
> >> that the default jit_above_cost value of 100000 seems absurdly low.
> >> On good hardware like we have, even well-planned queries with costs
> >> well above that won't take as long as JIT compilation does.
> >
> > It would be good to know your evidence for thinking it's too low.

It's definitely possible that I stated this much more emphatically
than I should have -- it was coming out of my frustration with this
situation after all.

I think, though, that my later comments here will provide some
philosophical justification for it.
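
For reference, the thresholds involved and the compile-time overhead are
easy to inspect on any instance; here's a quick sketch (the table name is
just a stand-in):

  -- Cost thresholds the planner compares against total plan cost:
  SHOW jit_above_cost;           -- 100000 by default
  SHOW jit_inline_above_cost;    -- 500000 by default
  SHOW jit_optimize_above_cost;  -- 500000 by default

  -- EXPLAIN (ANALYZE) reports JIT time separately from execution time,
  -- in the trailing "JIT:" block (Generation, Inlining, Optimization,
  -- Emission timings), so the two can be compared directly:
  EXPLAIN (ANALYZE) SELECT sum(amount) FROM some_big_table;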

> > The main problem I see with it is that the costing does not account
> > for how many expressions will be compiled. It's quite different to
> > compile JIT expressions for a query to a single table with a simple
> > WHERE clause vs some query with many joins which scans a partitioned
> > table with 1000 partitions, for example.
> >
>
> I think it's both - as James explained, there are queries with costs
> well above the threshold for which JIT compilation still takes much
> longer than just running the query without JIT. So a 100k cost
> difference is clearly not sufficient to make up for the extra JIT
> compilation cost.
>
> But it's true that this is because the JIT costing is very crude, with
> little effort to account for how expensive the compilation will be
> (say, how many expressions there are, ...).
>
> IMHO there's no "good" default that wouldn't hurt an awful lot of cases.
>
> There's also a lot of bias - people are unlikely to notice/report cases
> when the JIT (including costing) works fine. But they sure are annoyed
> when it makes the wrong choice.
>
> >> But on the topic of the thread: I'd like to know if anyone has ever
> >> considered implementing a GUC/feature like
> >> "max_concurrent_jit_compilations" to cap the number of backends that
> >> may be compiling a query at any given point, so that we keep an
> >> optimization from running amok and consuming all of a server's
> >> resources?
> >
> > Why does the number of backends matter? JIT compilation consumes the
> > same CPU resources that it is meant to save. If the JIT compilation
> > in your query happened to be a net win rather than a net loss in
> > terms of CPU usage, then why would max_concurrent_jit_compilations be
> > useful? It would just restrict what we could save. This idea just
> > covers up the fact that the JIT costing is disconnected from reality.
> > It's a bit like trying to tune your radio with the volume control.
> >
>
> Yeah, I don't quite get this point either. If JIT for a given query
> helps (i.e. makes execution shorter), it'd be harmful to restrict the
> maximum number of concurrent compilations. If we just disable JIT after
> some threshold is reached, that'd make queries longer and just make the
> pileup worse.

My thought process here is that, given the poor modeling of JIT costing
you've both described, we're likely to estimate the cost of "easy" JIT
compilation acceptably well but to estimate "complex" JIT compilation
at far below its actual cost.

Another way of saying this is that our range of JIT compilation costs
may well be fine on the bottom end but clamped on the high end, and
that means that our failure modes will tend towards the worst
mis-costings being the most painful (e.g., 2s compilation time for a
100ms query). This is even more the case in an OLTP system where the
majority of queries are already known to be quite fast.
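
A contrived sketch of that clamped high end, along the lines of David's
partitioning example (all names illustrative, on a scratch database):

  CREATE TABLE events (id bigint, created date)
    PARTITION BY RANGE (created);

  DO $$
  BEGIN
    FOR i IN 0..499 LOOP
      EXECUTE format(
        'CREATE TABLE events_p%s PARTITION OF events
           FOR VALUES FROM (%L) TO (%L)',
        i, DATE '2020-01-01' + i, DATE '2020-01-01' + (i + 1));
    END LOOP;
  END $$;

  SET jit = on;
  SET jit_above_cost = 0;  -- force JIT so the effect is visible
  -- The "Functions: N" line in the JIT summary grows with the number
  -- of partitions scanned, yet the cost model pays no attention to N:
  EXPLAIN (ANALYZE) SELECT count(*) FROM events;

On the cost side nothing distinguishes this plan from a single-table
scan of the same total cost, which is exactly the disconnect David
points out.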

In that context, capping the number of backends compiling concurrently,
particularly where plans (and JIT?) might be cached, could well save us
(depending on workload).

That being said, I could imagine an alternative approach to solving a
similar problem -- a way of exiting early from compilation if it takes
longer than we expect.
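
For completeness: the only escape hatch that exists today is coarse --
disabling JIT outright wherever it's known to hurt, per role or per
transaction (the role and query here are illustrative):

  ALTER ROLE oltp_app SET jit = off;  -- e.g. for an OLTP application

  BEGIN;
  SET LOCAL jit = off;                -- just for this transaction
  SELECT * FROM orders WHERE id = 42;
  COMMIT;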

> If it doesn't help for a given query, we shouldn't be doing it at all.
> But that should be based on better costing, not some threshold.
>
> In practice there'll be a mix of queries where JIT does/doesn't help,
> and this threshold would just arbitrarily (and quite unpredictably)
> enable/disable JIT, making it yet harder to investigate slow queries
> (as if we didn't have enough trouble with that already).
>
> > I think the JIT costs would be better if they took into account how
> > useful each expression will be to JIT compile. There were some ideas
> > thrown around in [1].
> >
>
> +1 to that

That does sound like an improvement.

One thing about our JIT that differs from, e.g., browser JS engine
JITing is that we don't substitute the JIT-compiled code "on the fly"
while execution is already underway. That'd be another, albeit quite
difficult, way to solve these issues.

Regards,
James Coleman
