| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Lukas Fittl <lukas(at)fittl(dot)com> |
| Cc: | John Naylor <johncnaylorls(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Geier <geidav(dot)pg(at)gmail(dot)com> |
| Subject: | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Date: | 2026-04-07 04:55:45 |
| Message-ID: | znzsoci7b3crunqxf66xdylnllijbqij4vafos7yskmrbtqxhs@6hegy3qeu7e7 |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On 2026-04-06 20:41:46 -0700, Lukas Fittl wrote:
> On Mon, Apr 6, 2026 at 5:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I wonder if the cpuid tests should be a bit further abstracted into
> > pg_cpu_x86.c.
> >
> > E.g. instead of tsc_detect_frequency() checking for PG_RDTSCP,
> > PG_TSC_INVARIANT, PG_TSC_ADJUST we could have
> >
> > PG_TSC_AVAILABLE /* RDTSCP & INVARIANT */
> > PG_TSC_KNOWN_RELIABLE /* PG_TSC_AVAILABLE && PG_TSC_ADJUST */
> > PG_TSC_FREQUENCY_KNOWN /* x86_tsc_frequency_khz works */
> >
> > and always run all of that during set_x86_features().
>
> I think that could work, but I kept the flags in features closer to
> being direct mappings to CPUID bits since that seemed to be intent of
> how John designed the facility originally.
>
> John, do you have thoughts on this? (I've not changed it for now)
I'm ok either way.
> FWIW, I don't think having PG_TSC_KNOWN_RELIABLE makes sense in any
> case, because that would tie together x86_tsc_frequency_khz and
> set_x86_features, i.e. you'd either have the frequency return function
> modify X86Features later, or always run x86_tsc_frequency_khz when
> setting features (and that'd then require you to put the frequency
> value somewhere, etc.)
I was thinking the latter.
> I've gone ahead and rewritten that whole paragraph for clarity, and
> also split it into two. Feedback welcome:
>
> <para>
> If enabled, the TSC clock source will use specialized CPU instructions
> when measuring time intervals. This lowers timing overhead compared to
> reading the OS system clock, and reduces the measurement error on top
> of the actual runtime, for example with EXPLAIN ANALYZE.
> </para>
> <para>
> On x86-64 CPUs the TSC clock source utilizes the Time-Stamp Counter (TSC)
It's a bit weird that the third use of TSC in these paragraphs introduces
Time-Stamp Counter. I can see how you get there, but ...
Now I wonder if we should rename 'tsc' to 'cpu'...
> of the CPU. The RDTSC instruction is used to read the TSC for EXPLAIN ANALYZE.
> For timings that require higher precision the RDTSCP instruction is used,
> which avoids inaccuracies due to CPU instruction re-ordering. Use of
> RDTSC/RDTSCP is not supported on older x86-64 CPUs or hypervisors that don't
> pass the TSC frequency to guest VMs, and is not advised on systems that
s/guest VMs/virtual machines/?
> utilize an emulated TSC. The TSC clock source is currently not supported on
> other architectures.
The not support bit about hypervisors isn't quite right though? We do even use
it automatically if TSC_ADJUST is set (and the calibration loop succeeds).
> </para>
> <para>
> To help decide which clock source to use you can run the
> <application>pg_test_timing</application>
> utility to check TSC availability, and perform timing measurements.
> </para>
How about a link to to the pg_test_timing page? Hm, I guess that should also
be updated with new output.
I'd also sprinkle a few <acronym> and <command>s around.
Wonder if it's worth adding something like
<indexterm><primary><acronym>RDTSC</acronym></primary></indexterm>
<indexterm>
<primary>Time-Stamp Counter</primary>
<see><acronym>TSC</acronym></see>
</indexterm>
<indexterm><primary><acronym>TSC</acronym></primary></indexterm>
otherwise somebody seeing one of these in logs, pg_test_timing output or
whatever has even less of a chance to figure it out within our docs. They're
not hard to search for terms exactly, so ...
> I've also marked pg_get_ticks(_fast) as pg_attribute_always_inline,
> per an off-list comment from Andres that he observed GCC not fully
> inlining that function in pg_test_timing, presumably due to the
> likely(..) in it.
It's not the likely, I reproduced it even without that. I mouthed off about
compilers on mastodon and was kindly asked to just open a bug report :)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124795
I think discussing which indexterms should be added signals that this is
pretty close.
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | John Naylor | 2026-04-07 04:56:04 | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Previous Message | Tom Lane | 2026-04-07 04:44:47 | Re: Adding REPACK [concurrently] |