Re: Using POPCNT and other advanced bit manipulation instructions

From: Jose Luis Tallon <jltallon(at)adv-solutions(dot)net>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Date: 2018-12-20 10:58:43
Message-ID: df95d1e8-8e63-256a-2d3e-ed2f8cff2f37@adv-solutions.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20/12/18 6:53, David Rowley wrote:
> Back in 2016 [1] there was some discussion about using the POPCNT
> instruction to improve the performance of counting the number of bits
> set in a word. Improving this helps various cases, such as
> bms_num_members and also things like counting the allvisible and
> frozen pages in the visibility map.
>
> [snip]
>
> I've put together a very rough patch to implement using POPCNT and the
> leading and trailing 0-bit instructions to improve the performance of
> bms_next_member() and bms_prev_member(). The correct function should
> be determined on the first call to each function by way of setting a
> function pointer variable to the most suitable supported
> implementation. I've not yet gone through and removed all the
> number_of_ones[] arrays to replace with a pg_popcount*() call.

IMVHO: Please do not disregard potential optimization by the compiler
around those calls.. o_0  That might explain the reduced performance
improvement observed.

Not that I can see any obvious alternative to your implementation right
away ...

> That
> seems to have mostly been done in Thomas' patch [3], part of which
> I've used for the visibilitymap.c code changes. If this patch proves
> to be possible, then I'll look at including the other changes Thomas
> made in his patch too.
>
> What I'm really looking for by posting now are reasons why we can't do
> this. I'm also interested in getting some testing done on older
> machines, particularly machines with processors that are from before
> 2007, both AMD and Intel.

I can offer a 2005-vintage Opteron 2216 rev3 (bought late 2007) to test
on. Feel free to toss me some test code.

cpuinfo flags:    fpu de tsc msr pae mce cx8 apic mca cmov pat clflush
mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow
rep_good nopl extd_apicid eagerfpu pni cx16 hypervisor lahf_lm
cmp_legacy 3dnowprefetch vmmcall

> 2007-2008 seems to be around the time both
> AMD and Intel added support for POPCNT and LZCNT, going by [4].

Thanks

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-12-20 11:49:58 Re: Tid scan improvements
Previous Message Dmitry Dolgov 2018-12-20 10:57:14 Re: Using POPCNT and other advanced bit manipulation instructions