Re: Stampede of the JIT compilers

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Pirotte <dpirotte(at)gmail(dot)com>
Subject: Re: Stampede of the JIT compilers
Date: 2023-06-25 18:06:02
Message-ID: 563378.1687716362@sss.pgh.pa.us
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> I've seen plenty of other reports and I do agree there is a problem,
> but I think you're jumping to conclusions in this particular case.
> I've seen nothing here that couldn't equally indicate that the planner
> overestimated the costs or some row estimate for the given query. The
> solution to those problems shouldn't be bumping up the default JIT
> thresholds; it could be to fix the costs or to tune/add statistics to
> get better row estimates.
> I don't think it's too big an ask to see a few more details so that we
> can confirm what the actual problem is.

Okay, I re-did the regression tests with log_min_duration_statement set to
zero, and then collected the reported runtimes. (This time, the builds
also had --enable-cassert turned off, unlike my quick check yesterday.)
I attach the results for anyone interested in doing their own analysis,
but my preliminary impression is:

(1) There is *no* command in the core regression tests where it makes
sense to invoke JIT. This is unsurprising really, because we don't
allow any long-running queries there. The places where the time
with LLVM beats the time without LLVM look to be noise. (I didn't
go so far as to average the results from several runs, but perhaps
someone else would wish to.)

(2) Nonetheless, we clearly do invoke JIT in some places, and it adds
as much as a couple hundred ms to what had been a query requiring a few
ms. I've investigated several of the ones with the worst time penalties,
and they indeed look to be estimation errors. The planner is guessing
that a join for which it lacks any stats will produce some tens of
thousands of rows, which it doesn't really, but that's enough to persuade
it to apply JIT.
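(For anyone reproducing this: the JIT decision is driven purely by the total
estimated plan cost crossing jit_above_cost, default 100000, so an inflated
join estimate is enough to trip it even when the real row count is tiny.
A sketch of how to observe that on a toy query -- t1/t2 are illustrative
names, not tables from the regression tests:

    -- With jit = on, EXPLAIN ANALYZE reports a "JIT:" section showing
    -- the time spent generating, inlining, and optimizing code whenever
    -- the plan's total cost exceeds jit_above_cost.
    SET jit = on;
    SET jit_above_cost = 100000;   -- the default
    EXPLAIN (ANALYZE)
    SELECT t1.x, count(*)
    FROM t1 JOIN t2 ON t1.x = t2.x   -- no stats => inflated join estimate
    GROUP BY t1.x;

Comparing the "JIT:" timings against the query's actual runtime makes the
overhead described above directly visible.)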

(3) I still think this is evidence that the cost thresholds are too low,
because even if these joins actually did produce some tens of thousands
of rows, I think we'd be well shy of breakeven to use JIT. We'd have
to do some more invasive testing to prove that guess, of course. But
it looks to me like the current situation is effectively biased towards
using JIT when we're in the gray zone, and we'd be better off reversing
that bias.
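(Until any default change lands, the bias is at least easy to reverse per
installation, since the thresholds are ordinary GUCs. The values below are
illustrative only, not a recommendation settled in this thread:

    -- postgresql.conf or per-session; raising the thresholds shrinks the
    -- gray zone in which a misestimated plan triggers compilation.
    SET jit_above_cost = 500000;            -- default 100000
    SET jit_inline_above_cost = 2500000;    -- default 500000
    SET jit_optimize_above_cost = 2500000;  -- default 500000
    -- or, for workloads of uniformly short queries:
    SET jit = off;

)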

regards, tom lane

Attachment Content-Type Size
timings.dump text/plain 4.9 MB
