| From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
|---|---|
| To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: micro-optimize nbtcompare.c routines |
| Date: | 2024-09-27 03:23:43 |
| Message-ID: | ZvYlP4IXbZoI4pQr@nathan |
| Lists: | pgsql-hackers |
On Fri, Sep 27, 2024 at 02:50:13PM +1200, David Rowley wrote:
> I had been looking at [1] (which I've added your version to now). I
> had been surprised to see gcc emitting different code for the first 3
> versions. Clang does a better job at figuring out they all do the same
> thing and emitting the same code for each.
Interesting.
> I played around with the attached (hacked up) qsort.c to see if there
> was any difference. Likely function call overhead kills the
> performance anyway. There does not seem to be much difference between
> them. I've not tested with an inlined comparison function.
I'd expect worse performance with the branchless routines for the inlined
case. However, I recall that with the branchless routines clang was able to
optimize med3() as well as it does with the branching ones, so that may not
always be true.
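
For readers following along, the two styles being compared can be sketched roughly as below. This is an illustration only, not code from the patch or from nbtcompare.c; the function names `cmp_branching` and `cmp_branchless` are made up for the example.

```c
#include <stdint.h>

/*
 * Branching three-way comparison: explicit tests with early returns.
 * (Hypothetical name; not taken from the patch under discussion.)
 */
static int
cmp_branching(int32_t a, int32_t b)
{
	if (a > b)
		return 1;
	else if (a == b)
		return 0;
	else
		return -1;
}

/*
 * Branchless three-way comparison: each relational expression evaluates
 * to 0 or 1, so the difference is always -1, 0, or 1.
 * (Hypothetical name; not taken from the patch under discussion.)
 */
static int
cmp_branchless(int32_t a, int32_t b)
{
	return (a > b) - (a < b);
}
```

Both functions return the same value for every pair of inputs. Whether the compiler emits jumps, conditional moves, or set-on-condition instructions for either one depends on the compiler, the optimization level, and whether the call is inlined.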
> Looking at your version, it doesn't look like there's any sort of
> improvement in terms of the instructions. Certainly, for clang, it's
> worse as it adds a shift left instruction and an additional compare.
> No jumps, at least.
I think I may have forgotten to add -O2 when I was inspecting this code
with godbolt.org earlier. *facepalm* The different versions look pretty
comparable with that added.
> What's your reasoning for returning INT_MIN and INT_MAX?
That's just for the compile option added by commit c87cb5f, which IIUC is
intended to test that we correctly handle comparisons that return INT_MIN.
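
As a rough sketch of why an INT_MIN return value is worth testing: a caller that reverses a comparison by negating the result breaks on INT_MIN, because -INT_MIN is not representable in an int (signed overflow, which is undefined behavior in C). The names `cmp_extreme` and `invert_result` below are hypothetical and only illustrate the hazard; they are not taken from the commit.

```c
#include <limits.h>

/*
 * Hypothetical comparator that reports "less than" as INT_MIN and
 * "greater than" as INT_MAX instead of -1 and 1.
 */
static int
cmp_extreme(int a, int b)
{
	if (a > b)
		return INT_MAX;
	if (a < b)
		return INT_MIN;
	return 0;
}

/*
 * Invert a comparison result by looking only at its sign.  Writing "-r"
 * here would overflow when r == INT_MIN, so the sign tests are used
 * instead of negation.
 */
static int
invert_result(int r)
{
	return (r < 0) - (r > 0);
}
```

Code that only checks the sign of the result (`< 0`, `== 0`, `> 0`) handles these extreme values correctly; code that negates the result or compares it against -1 or 1 does not.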
--
nathan