|From:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>|
|To:||John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>|
|Cc:||David Fetter <david(at)fetter(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>|
|Subject:||Re: use CLZ instruction in AllocSetFreeIndex()|
John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> writes:
> v2 had an Assert that was only correct while experimenting with
> eliding right shift. Fixed in v3.
I think there must have been something wrong with your test that
said that eliminating the right shift from the non-CLZ code made
it slower. It should be an unconditional win, just as it is for
the CLZ code path. (Maybe some odd cache-line-boundary effect?)
Also, I think it's just weird to account for ALLOC_MINBITS one
way in the CLZ path and the other way in the other path.
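For readers following along, here is a minimal standalone sketch of the two ways of accounting for ALLOC_MINBITS. It is not the committed code: the helper names (leftmost_one_pos32, free_index_with_shift, free_index_no_shift) are made up for illustration, and ALLOC_MINBITS = 3 with sizes up to 8192 is an assumption about the allocator's constants. The point is only that shifting before taking the log and subtracting ALLOC_MINBITS afterwards give the same index, so the shift can be folded into a constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALLOC_MINBITS 3			/* assumed: smallest chunk is 8 bytes */

/* 0-based position of the highest set bit; word must be nonzero. */
static int
leftmost_one_pos32(uint32_t word)
{
#if defined(__GNUC__) || defined(__clang__)
	return 31 - __builtin_clz(word);	/* CLZ path */
#else
	int			pos = 0;

	while (word >>= 1)			/* portable fallback */
		pos++;
	return pos;
#endif
}

/* Shift out ALLOC_MINBITS first, then take the log. */
static int
free_index_with_shift(size_t size)
{
	if (size <= ((size_t) 1 << ALLOC_MINBITS))
		return 0;
	return leftmost_one_pos32((uint32_t) (size - 1) >> ALLOC_MINBITS) + 1;
}

/* Take the log of size - 1 directly; ALLOC_MINBITS becomes a constant. */
static int
free_index_no_shift(size_t size)
{
	if (size <= ((size_t) 1 << ALLOC_MINBITS))
		return 0;
	return leftmost_one_pos32((uint32_t) (size - 1)) - ALLOC_MINBITS + 1;
}
```

Both functions compute ceil(log2(size >> ALLOC_MINBITS)) for any size above the minimum chunk size, so they should agree for every request size the freelists handle.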
I decided that it might be a good idea to do performance testing
in-place rather than in a standalone test program. I whipped up
the attached that just does a bunch of palloc/pfree cycles.
I got the following results on a non-cassert build (medians of
a number of tests; the times are repeatable to ~ 0.1% for me):
    HEAD:          2429.431 ms
    v3 CLZ:        2131.735 ms
    v3 non-CLZ:    2477.835 ms
    remove shift:  2266.755 ms
I didn't bother to try this on non-x86_64 architectures, as
previous testing convinces me the outcome should be about the same.
Hence, pushed that way, with a bit of additional cosmetic foolery:
the static assertion made more sense to me in relation to the
documented assumption that size <= ALLOC_CHUNK_LIMIT, and I
thought the comment could use some work.
regards, tom lane