Re: use CLZ instruction in AllocSetFreeIndex()

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: use CLZ instruction in AllocSetFreeIndex()
Date: 2019-12-28 00:02:02
Message-ID: CACPNZCsNDdi8Jt=2eiEf+iBmkwG6kL8GmRRt=JPnVojthr20bQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 27, 2019 at 11:05 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> writes:
> > On Fri, Dec 27, 2019 at 9:54 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> ... but couldn't the
> >> right shift be elided in favor of changing the constant we
> >> subtract clz's result from? Shifting off those bits separately
> >> made sense in the old implementation, but assuming that CLZ is
> >> more or less constant-time, it doesn't with CLZ.
>
> > That makes sense -- I'll look into doing that.
>
> Actually, we could apply that insight to both code paths.
> In the existing path, that requires assuming
> ALLOCSET_NUM_FREELISTS+ALLOC_MINBITS <= 17, but that's OK.
> (Nowadays I'd probably add a StaticAssert about that.)

I tried that in the attached files and got these results:

current 6.14s
clz 4.52s
clz no right shift 3.15s
lookup table 5.56s
lookup table no right shift 7.34s

Here, "lookup table" refers to using the pg_leftmost_one_pos[] array
and incrementing the result. Removing the shift operation from the CLZ
case is clearly an improvement, and the main body goes from

movabsq $34359738367, %rax
addq %rax, %rdi
shrq $3, %rdi
bsrl %edi, %eax
xorl $-32, %eax
addl $33, %eax

to

decl %edi
bsrl %edi, %eax
xorl $-32, %eax
addl $30, %eax

The lookup table case is less clear. Removing the shift results in
assembly that looks more like the C code and is slower for me. The
standard lookup table code uses some magic constants and does its own
constant folding by shifting 11 (8 + 3). In the absence of testing on
platforms that will actually exercise this path, it seems the
open-coded path should keep the shift for now. Thoughts?

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v2-test_allocsetfreeindex.c application/octet-stream 3.5 KB
v2-test_allocsetfreeindex2.c application/octet-stream 2.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Maciek Sakrejda 2019-12-28 01:11:38 Re: Duplicate Workers entries in some EXPLAIN plans
Previous Message legrand legrand 2019-12-27 23:42:04 Re: Implementing Incremental View Maintenance