Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Geier <geidav(dot)pg(at)gmail(dot)com>
Subject: Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?
Date: 2026-04-02 08:50:08
Message-ID: CAP53PkwLjAz5A8+wHU5G4ZNVT5ACBLZqG1hGg66y6OZXLvr68w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 1, 2026 at 7:18 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> -# Check for __get_cpuid_count() and __cpuidex() in a similar fashion.
> +# Check for __get_cpuid_count() and __cpuidex() separately, since we sometimes
> +# need __cpuidex() even if __get_cpuid_count() is available.
>
> I'm not sure we need to document here that we need them separately. It
> seems more important to do that for the __cpuidex helper that's in a
> later patch. I don't feel strongly, though.
>
> I'd probably just do
>
> # Check for __get_cpuid_count()
> ...
>
> # Check for __cpuidex()
> ...
>
> Either way, we do need a space between separate tests (ditto Meson).

Makes sense, updated.

>
> +if cc.links('''
> + #ifdef _MSC_VER
> #include <intrin.h>
> + #else
> + #include <cpuid.h>
> + #endif
> int main(int arg, char **argv)
> {
> - unsigned int exx[4] = {0, 0, 0, 0};
> + int exx[4] = {0, 0, 0, 0};
> __cpuidex(exx, 7, 0);
> }
>
> Hmm, maybe we can declare unsigned and cast to signed for MSVC as we
> do where the code is used.

Sure, I think it works either way - updated.

>
> +# __cpuidex()
> +AC_CACHE_CHECK([for __cpuidex], [pgac_cv__cpuidex],
> +[AC_LINK_IFELSE([AC_LANG_PROGRAM([#ifdef _MSC_VER
> + #include <intrin.h>
> + #else
> + #include <cpuid.h>
> + #endif],
> + [[int exx[4] = {0, 0, 0, 0};
> + __cpuidex(exx, 7, 0);
> + ]])],
> + [pgac_cv__cpuidex="yes"],
> + [pgac_cv__cpuidex="no"])])
> +if test x"$pgac_cv__cpuidex" = x"yes"; then
> + AC_DEFINE(HAVE__CPUIDEX, 1, [Define to 1 if you have __cpuidex.])
> fi
>
> MSVC doesn't use autoconf, so we can leave out the intrin.h and keep
> using unsigned int.

I think that might be a problem for Clang on Windows, which has MSVC
compatibility headers, and will set _MSC_VER, see [0] and [1].

It probably works in practice, but it seems better to match the
autoconf checks to the actual code?

>
> -#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
> -#include <cpuid.h>
> -#endif
> -
> -#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
> +#if defined(HAVE__CPUID) || defined(HAVE__GET_CPUID) ||
> defined(HAVE__GET_CPUID_COUNT) || defined(HAVE__CPUIDEX)
> +#if defined(_MSC_VER)
> #include <intrin.h>
> +#else
> +#include <cpuid.h>
> +#endif /* defined(_MSC_VER) */
> #endif
>
> Now I'm thinking at least one of these will be defined, otherwise we'd
> have an #error, so we can just gate solely on _MSC_VER.

Yeah, makes sense, we would fail to compile in that case anyway.

I'm attaching v15 rebased, with the above changes made to 0001 to
unblock things, but feedback by Andres not addressed yet (will work on
that tomorrow).

John, feel free to adjust as you see fit in case there is any other
minor aspects to fix.

Thanks,
Lukas

[0]: https://clang.llvm.org/docs/UsersManual.html#microsoft-extensions
[1]: https://clang.llvm.org/doxygen/intrin_8h_source.html

--
Lukas Fittl

Attachment Content-Type Size
v15-0001-Check-for-HAVE__CPUIDEX-and-HAVE__GET_CPUID_COUN.patch application/octet-stream 5.9 KB
v15-0004-pg_test_timing-Also-test-RDTSC-RDTSCP-timing-and.patch application/octet-stream 6.2 KB
v15-0003-instrumentation-Use-Time-Stamp-Counter-TSC-on-x8.patch application/octet-stream 38.2 KB
v15-0002-instrumentation-Streamline-ticks-to-nanosecond-c.patch application/octet-stream 13.1 KB
v15-0005-instrumentation-ARM-support-for-fast-time-measur.patch application/octet-stream 8.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2026-04-02 09:21:20 RE: PROPOSAL for when publication row filter columns are not in replica identity (BUG #19434)
Previous Message Amit Kapila 2026-04-02 08:33:09 Re: DOC: pg_publication_rel.prrelid says sequences are possible