Re: refactor architecture-specific popcount code

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: Greg Burd <greg(at)burd(dot)me>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: refactor architecture-specific popcount code
Date: 2026-02-20 15:39:38
Message-ID: aZiAOo25VBa6PoQi@nathan
Lists: pgsql-hackers

On Fri, Feb 20, 2026 at 03:21:05PM +0700, John Naylor wrote:
> On Thu, Feb 5, 2026 at 4:43 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> Yeah, the plain C version might be marginally slower than the built-in
>> version for that test, but it still seems quite a bit faster than HEAD.
>>
>> HEAD  v8  v10
>> 40    25  29
>
> (for the following, numbers are nanoseconds per call from
> drive_bms_num_members())
>
> Seems similar on S390X / gcc 13.3 (last week I only tested a single
> bitmapword and don't feel like repeating):
>
> master (older): 4.1083 (call builtin)
> v8: 2.8889 (inline builtin)
> v10: 2.7961 (inline pure C)

Thanks for testing it.

> On ppc64le / gcc 8.5, without native popcount it suffers:
>
> words  master  v14
> 1      4.5     6.5
> 2      5.8     9.7
> 64     67.9    101
> 128    143     190
>
> So one up, one down among obscure platforms. The case for the
> builtin seems fairly thin now, although it's not zero.

I spent some time looking at how clang/gcc compiled the plain-C version on
various architectures [0], and I was pleasantly surprised to discover that
at some point in recent history they started automatically converting it to
special popcount instructions. I suspect that you'd see better results on
ppc64le if you upgraded the compiler...

[0] https://godbolt.org/z/v9vvx7E89
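
For readers who have not seen a plain-C popcount, here is a minimal sketch
of the kind of code being discussed. It is not copied from the patch: the
function names are hypothetical, and the bit-twiddling (SWAR) formulation is
only an assumption about what a portable version might look like. Whether a
given compiler turns it into a single popcount instruction depends on the
compiler version and target flags, so check the generated assembly rather
than taking it on faith.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical portable popcount for one 64-bit word (SWAR method).
 * This name is illustrative only; it is not the function in the patch.
 */
static inline int
popcount64_plain(uint64_t x)
{
    /* Count bits within each 2-bit group. */
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    /* Sum adjacent 2-bit counts into 4-bit groups. */
    x = (x & UINT64_C(0x3333333333333333)) +
        ((x >> 2) & UINT64_C(0x3333333333333333));
    /* Sum adjacent 4-bit counts into per-byte counts. */
    x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    /* Add all eight byte counts together; the total lands in the top byte. */
    return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}

/*
 * Hypothetical helper: total set bits across an array of words, roughly
 * the shape of work a multi-word bitmap count has to do.
 */
static inline uint64_t
popcount_words_plain(const uint64_t *words, size_t nwords)
{
    uint64_t total = 0;

    for (size_t i = 0; i < nwords; i++)
        total += (uint64_t) popcount64_plain(words[i]);
    return total;
}
```

Compiling something like this at -O2 with different compiler versions and
-march settings and diffing the output is an easy way to see where the
instruction selection changes.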

--
nathan
