| From: | Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> |
|---|---|
| To: | Lukas Fittl <lukas(at)fittl(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, John Naylor <johncnaylorls(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, David Geier <geidav(dot)pg(at)gmail(dot)com> |
| Subject: | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Date: | 2026-04-07 07:32:43 |
| Message-ID: | CAN4CZFPDWoXTQHSd8xhv_Q9UmWX2QunMX-cKD_UTenzbcY4PeQ@mail.gmail.com |
| Lists: | pgsql-hackers |
> It's intentionally uint64, per this comment above it:
>
> * Note we utilize unsigned integers even though ticks are stored as a signed
> * value to encourage compilers to generate better assembly, since we can be
> * sure these values are not negative.
>
> In my earlier Compiler Explorer tests that did actually make a
> difference for the generated assembly.
Isn't that comment more about ticks_per_ns_scaled?
For max_ticks_no_overflow the only use is with a cast to int64, so I
didn't expect much assembly difference. Now I actually checked
locally/godbolt, and I don't see any actual differences. Making
max_ticks_no_overflow int64 and removing that cast generates exactly
the same code.
For ticks_per_ns_scaled, gcc 9-10 actually generates one extra mov
instruction with int64, but that's not present in more recent
versions.
With recent compiler versions the only differences are idiv vs. div
and sar vs. shr. idiv is slower than div on Intel, so that is a point
for keeping ticks_per_ns_scaled unsigned.
On ARM I see the same asr/lsr and sdiv/udiv difference.
https://godbolt.org/z/4r5GTbrs3
(The main gcc vs. clang difference seems to be clang's 32-bit division
optimization.)
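
For anyone who wants to reproduce the comparison without opening the link, here is a minimal sketch of the kind of functions I mean. It is not the code from the patch: the function names and TICKS_SCALE_SHIFT are made up for illustration, and only the signedness of the scale operand differs between the two pairs. Built with optimization, the unsigned pair should compile to shr/div on x86-64 (lsr/udiv on ARM) and the signed pair to sar/idiv (asr/sdiv).

```c
#include <stdint.h>

/* Hypothetical shift width, chosen only for this illustration. */
#define TICKS_SCALE_SHIFT 14

/* Unsigned scale: a logical shift and an unsigned divide are enough. */
static inline int64_t
ticks_shift_unsigned(int64_t ticks, uint64_t scale)
{
	return (int64_t) (((uint64_t) ticks * scale) >> TICKS_SCALE_SHIFT);
}

static inline int64_t
ticks_div_unsigned(int64_t ticks, uint64_t scale)
{
	return (int64_t) ((uint64_t) ticks / scale);
}

/*
 * Signed scale: the compiler has to preserve the sign, so it emits an
 * arithmetic shift and a signed divide.
 */
static inline int64_t
ticks_shift_signed(int64_t ticks, int64_t scale)
{
	return (ticks * scale) >> TICKS_SCALE_SHIFT;
}

static inline int64_t
ticks_div_signed(int64_t ticks, int64_t scale)
{
	return ticks / scale;
}
```

For non-negative inputs both pairs return the same values, so the choice only affects which instructions are generated.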