| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Lukas Fittl <lukas(at)fittl(dot)com> |
| Cc: | John Naylor <johncnaylorls(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Geier <geidav(dot)pg(at)gmail(dot)com> |
| Subject: | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Date: | 2026-03-08 16:39:47 |
| Message-ID: | opaq3twixq6uubmgclesklstm4cpe2mtmuwgm4pvsgoo33rep7@c3uph7pnlw4p |
| Lists: | pgsql-hackers |
Hi,
On 2026-03-06 11:47:10 -0800, Lukas Fittl wrote:
> > But maybe we should just do the stupid thing and figure out the multiplier as
> > such:
> >
> > ns_to_cycles = tsc_via_rdtsc / to_ns(clock_gettime(CLOCK_BOOTTIME))
> >
> > in some quick experiments that ends up with a very good estimate. There would
> > have to be an awful long gap between the rdtsc and clock_gettime() computation
> > for the frequency to be meaningfully inaccurate.
>
> I think as long as the TSC counter and the clock boottime start at the
> same moment, that should work. But I'm not sure if we can rely on that
> to be the case in virtualized environments? I can do some more
> testing.
I did some testing, and unfortunately it's not good enough. There are several
issues:
- The TSC starts counting earlier than the OS clock, by enough to make the
computed frequency initially not quite right. It's not that bad on a laptop
with a quick boot time, but on a server with slower BIOS initialization
(e.g. due to training of more memory) it's worse.
- If the server is rebooted not through a hard reset (the typical default),
but through something like kexec (which does not go through the BIOS again),
the TSC is not reset.
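
To put rough numbers on the first issue, here is a minimal sketch of the arithmetic (the function name and the example figures are made up for illustration, not measurements): if the TSC has a head start over CLOCK_BOOTTIME, the naive ratio overestimates the frequency by head_start / uptime.

```c
#include <assert.h>

/*
 * Hypothetical helper: the frequency the "tsc / boottime" estimate would
 * report, given the true frequency, how many seconds the TSC was already
 * running before CLOCK_BOOTTIME started, and the current uptime.
 */
static double
naive_tsc_hz(double true_hz, double head_start_s, double uptime_s)
{
	assert(uptime_s > 0.0);

	/* TSC has counted for (uptime + head start), the clock only for uptime */
	return true_hz * (uptime_s + head_start_s) / uptime_s;
}
```

With an assumed 30 second head start and one hour of uptime, the estimate is about 0.8% too high. After a kexec the head start includes the entire previous uptime, so the error can be arbitrarily large.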
> Alternatively, we could consider doing it like the Kernel does it for
> its calibration loop, and wait 1 second of wall time, and then see how
> far the TSC counter has advanced.
Yea, I think we need a calibration loop, unfortunately. But I think it should
be doable to make it a lot quicker than waiting one second. I'm thinking of
something like a loop that measures the clock cycles and relative time
(using clock_gettime()) since the start and does so until the frequency
estimate predicts the measured time closely. I think that should take a few
10s of milliseconds at most.
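
A rough sketch of what such a loop could look like, not a proposed patch. The names (calibrate_cycles_per_ns, the 1 ms re-check step) and the use of CLOCK_MONOTONIC are assumptions made for illustration. It also assumes an invariant TSC that is synchronized across cores, and it falls back to a fake 1 GHz counter on non-x86 so that it compiles everywhere.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

/* Monotonic time in nanoseconds. */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}

/* Cycle counter; stand-in 1 GHz "counter" where rdtsc is unavailable. */
static uint64_t
read_cycles(void)
{
#if defined(__x86_64__) || defined(__i386__)
	return __rdtsc();
#else
	return now_ns();
#endif
}

/*
 * Spin until the previous frequency estimate predicts the elapsed time to
 * within "tolerance" (relative error), or until max_ns has passed.
 * Returns the latest estimate in cycles per nanosecond.
 */
static double
calibrate_cycles_per_ns(uint64_t max_ns, double tolerance, bool *converged)
{
	const uint64_t step_ns = 1000000;	/* re-check roughly every 1 ms */
	uint64_t	t0 = now_ns();
	uint64_t	c0 = read_cycles();
	uint64_t	next_check = step_ns;
	double		est = 0.0;

	assert(max_ns >= step_ns && tolerance > 0.0);
	*converged = false;

	for (;;)
	{
		uint64_t	elapsed = now_ns() - t0;
		uint64_t	cycles = read_cycles() - c0;

		if (elapsed < next_check)
			continue;

		if (est > 0.0)
		{
			/* how well did the previous estimate predict this sample? */
			double		diff = (double) cycles / est - (double) elapsed;

			if (diff < 0)
				diff = -diff;
			if (diff / (double) elapsed <= tolerance)
				*converged = true;
		}

		est = (double) cycles / (double) elapsed;
		if (*converged || elapsed >= max_ns)
			return est;
		next_check = elapsed + step_ns;
	}
}
```

Calling it as calibrate_cycles_per_ns(50 * 1000000ULL, 1e-4, &converged) caps the calibration at 50 ms. Because both deltas are measured relative to the start of the loop, any offset between the TSC and the OS clock cancels out, which is what the boot-time ratio gets wrong.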
> FWIW, I ended up getting an x86 machine to be able to test these
> things better, and got myself an AMD CPU.
Dedication...
> Well, turns out that my
> non-virtualized AMD CPU ("AMD Ryzen™ AI Max+ 395") does not provide
> the TSC frequency via CPUID, at all :(
I can repro that on a somewhat older Zen 4 (7840U) laptop CPU.
> Instead on newer AMD CPUs you can use an MSR to get the TSC frequency,
> see [2]
:(
Greetings,
Andres Freund