Re: Misaligned BufferDescriptors causing major performance problems on AMD

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Misaligned BufferDescriptors causing major performance problems on AMD
Date: 2014-02-03 23:17:13
Message-ID: CAM3SWZTv=OJLkGYK0AA6yd-2gKs1A2xMmgpqec4yGWBvasM+5g@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 2, 2014 at 7:13 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Just as reference, we're talking about a performance degradation from
> 475963.613865 tps to 197744.913556 in a pgbench -S -cj64 just by setting
> max_connections to 90, from 91...

That's pretty terrible.

> So, I looked into this, and I am fairly certain it's because of the
> (mis-)alignment of the buffer descriptors. With certain max_connections
> settings InitBufferPool() happens to get 64byte aligned addresses, with
> others not. I checked the alignment with gdb to confirm that.

I find your diagnosis to be quite plausible.

> A quick hack (attached) making BufferDescriptor 64byte aligned indeed
> restored performance across all max_connections settings. It's not
> surprising that a misaligned buffer descriptor causes problems -
> there'll be plenty of false sharing of the spinlocks otherwise. Curious
> that the Intel machine isn't hurt much by this.

I think that is explained here:

http://www.agner.org/optimize/blog/read.php?i=142&v=t

With Sandy Bridge, "Misaligned memory operands [are] handled efficiently".

> Now all this hinges on the fact that by a mere accident
> BufferDescriptors are 64byte in size:

Are they 64 bytes in size on REL9_*_STABLE? How about on win64? I
think we're reasonably disciplined here already, but long is only 32
bits wide even on win64. Looks like it would probably be okay, but as
you say, it doesn't seem like something to leave to chance.

> We could polish up the attached patch and apply it to all the branches,
> the costs of memory are minimal. But I wonder if we shouldn't instead
> make ShmemInitStruct() always return cacheline aligned addresses. That
> will require some fiddling, but it might be a good idea nonetheless?

What fiddling are you thinking of?
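
For illustration only, one shape such fiddling could take is to
over-allocate by up to one cache line and round the returned address up.
This is a sketch under assumptions, not the attached patch and not how
ShmemInitStruct() works today: align_up(), cacheline_align_region() and
CACHELINE_SIZE are hypothetical names, and 64 is simply the figure
discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is the value discussed in the thread. */
#define CACHELINE_SIZE 64

/* Round ptr up to the next multiple of align (align must be a power of two). */
static void *
align_up(void *ptr, size_t align)
{
    uintptr_t p = (uintptr_t) ptr;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *) ((p + align - 1) & ~((uintptr_t) align - 1));
}

/*
 * Hypothetical helper: the caller is assumed to have reserved at least
 * (wanted size + CACHELINE_SIZE - 1) bytes starting at raw, so that the
 * aligned start still leaves room for the requested structure.
 */
static void *
cacheline_align_region(void *raw)
{
    return align_up(raw, CACHELINE_SIZE);
}
```

The cost is at most CACHELINE_SIZE - 1 wasted bytes per allocation, which
is in line with the "costs of memory are minimal" remark above.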

> I think we should also consider some more reliable measures to have
> BufferDescriptors cacheline sized, rather than relying on the happy
> accident. Debugging alignment issues isn't fun, too much of a guessing
> game...

+1. Maybe make code that isn't appropriately aligned fail to compile?
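
As a rough sketch of what a compile-time check could look like, assuming
a C11 compiler: pad the descriptor to one cache line with a union and
assert on the resulting size. DemoDesc and DemoDescPadded are
hypothetical stand-ins, not the real BufferDesc layout, and
CACHELINE_SIZE is an assumed constant.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define CACHELINE_SIZE 64   /* assumed cache line size */

/* Hypothetical stand-in for a descriptor; NOT the real BufferDesc. */
typedef struct DemoDesc
{
    uint32_t tag[5];        /* 20 bytes */
    uint16_t flags;         /*  2 bytes */
    uint16_t usage_count;   /*  2 bytes */
    int32_t  refcount;      /*  4 bytes */
    int32_t  buf_id;        /*  4 bytes */
} DemoDesc;

/* Pad the descriptor out to exactly one cache line. */
typedef union DemoDescPadded
{
    DemoDesc desc;
    char     pad[CACHELINE_SIZE];
} DemoDescPadded;

/* Compilation fails if the descriptor ever outgrows one cache line. */
static_assert(sizeof(DemoDesc) <= CACHELINE_SIZE,
              "descriptor does not fit in one cache line");
static_assert(sizeof(DemoDescPadded) == CACHELINE_SIZE,
              "padded descriptor must be exactly one cache line");
```

Note that this only pins down the size of each element. The start of the
array still has to be cache line aligned for the elements to line up with
cache lines, so a check like this complements an aligned allocation
rather than replacing it.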

--
Peter Geoghegan
