From: | John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: use CLZ instruction in AllocSetFreeIndex() |
Date: | 2019-12-28 00:02:02 |
Message-ID: | CACPNZCsNDdi8Jt=2eiEf+iBmkwG6kL8GmRRt=JPnVojthr20bQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Dec 27, 2019 at 11:05 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> writes:
> > On Fri, Dec 27, 2019 at 9:54 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> ... but couldn't the
> >> right shift be elided in favor of changing the constant we
> >> subtract clz's result from? Shifting off those bits separately
> >> made sense in the old implementation, but assuming that CLZ is
> >> more or less constant-time, it doesn't with CLZ.
>
> > That makes sense -- I'll look into doing that.
>
> Actually, we could apply that insight to both code paths.
> In the existing path, that requires assuming
> ALLOCSET_NUM_FREELISTS+ALLOC_MINBITS <= 17, but that's OK.
> (Nowadays I'd probably add a StaticAssert about that.)
I tried that in the attached files and got these results:
current 6.14s
clz 4.52s
clz no right shift 3.15s
lookup table 5.56s
lookup table no right shift 7.34s
Here, "lookup table" refers to using the pg_leftmost_one_pos[] array
and incrementing the result. Removing the shift operation from the CLZ
case is clearly an improvement, and the main body goes from
movabsq $34359738367, %rax
addq %rax, %rdi
shrq $3, %rdi
bsrl %edi, %eax
xorl $-32, %eax
addl $33, %eax
to
decl %edi
bsrl %edi, %eax
xorl $-32, %eax
addl $30, %eax
The lookup table case is less clear. Removing the shift results in
assembly that looks more like the C code and is slower for me. The
standard lookup table code uses some magic constants and does its own
constant folding by shifting 11 (8 + 3). In the absence of testing on
platforms that will actually exercise this path, it seems the
open-coded path should keep the shift for now. Thoughts?
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
v2-test_allocsetfreeindex.c | application/octet-stream | 3.5 KB |
v2-test_allocsetfreeindex2.c | application/octet-stream | 2.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Maciek Sakrejda | 2019-12-28 01:11:38 | Re: Duplicate Workers entries in some EXPLAIN plans |
Previous Message | legrand legrand | 2019-12-27 23:42:04 | Re: Implementing Incremental View Maintenance |