From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Subject: Re: Patch: fix lock contention for HASHHDR.mutex
Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru> writes:
> Turns out PostgreSQL can spend a lot of time waiting for a lock in this
> particular place, especially if you are running PostgreSQL on a 60-core
> server. Which is obviously a pretty bad sign.
> I managed to fix this behaviour by modifying choose_nelem_alloc
> procedure in dynahash.c (see attached patch).
TBH, this is just voodoo. I don't know why this change would have made
any impact on lock acquisition performance, and neither do you, and the
odds are good that it's pure chance that it changed anything. One likely
theory is that you managed to shift around memory allocations so that
something aligned on a cacheline boundary when it hadn't before. But, of
course, the next patch that changes allocations anywhere in shared memory
could change that back. There are lots of effects like this that appear
or disappear based on seemingly unrelated code changes when you're
measuring edge-case performance.
The patch is not necessarily bad in itself. As time goes by and machines
get bigger, it can make sense to allocate more memory at a time to reduce
memory management overhead. But arguing for it on the basis that it fixes
lock allocation behavior with 60 cores is just horsepucky. What you were
measuring there was steady-state hash table behavior, not the speed of the
allocate-some-more-memory code path.
regards, tom lane