Re: Memory Alignment in Postgres

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory Alignment in Postgres
Date: 2014-09-11 13:32:24
Message-ID: CAO_YK0V7v8fLYzbnjvewM+U_6g5gfexWXq0-0RbP9TpiKOk+Sw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 10, 2014 at 12:43 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Sep 9, 2014 at 10:08 AM, Arthur Silva <arthurprs(at)gmail(dot)com> wrote:
> > I'm continuously studying Postgres codebase. Hopefully I'll be able to make
> > some contributions in the future.
> >
> > For now I'm intrigued about the extensive use of memory alignment. I'm sure
> > there's some legacy and some architecture that requires it reasoning behind
> > it.
> >
> > That aside, since it wastes space (a lot of space in some cases) there must
> > be a tipping point somewhere. I'm sure one can prove aligned access is
> > faster in a micro-benchmark but I'm not sure it's the case in a DBMS like
> > postgres, specially in the page/rows area.
> >
> > Just for the sake of comparison Mysql COMPACT storage (default and
> > recommended since 5.5) doesn't align data at all. Mysql NDB uses a fixed
> > 4-byte alignment. Not sure about Oracle and others.
> >
> > Is it worth the extra space in newer architectures (specially Intel)?
> > Do you guys think this is something worth looking at?
>
> Yes. At least in my opinion, though, it's not a good project for a
> beginner. If you get your changes to take effect, you'll find that a
> lot of things will break in places that are not easy to find or fix.
> You're getting into really low-level areas of the system that get
> touched infrequently and require a lot of expertise in how things work
> today to adjust.
>

I thought all memory alignment (or at least the bulk of it) was handled
using some codebase-wide macros/settings; otherwise, how could different
parts of the code interoperate? Poking at this area might suffice for some
initial testing to check whether it's worth any more attention.

Unaligned memory access has received a lot of attention on Intel since the
Nehalem era, so it may very well pay off on Intel servers. You might find
this blog post and its comments/external links interesting:
http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/

I'm a newbie in the codebase, so please let me know if I'm saying anything
nonsensical.

> The idea I've had before is to try to reduce the widest alignment we
> ever require from 8 bytes to 4 bytes. That is, look for types with
> typalign = 'd', and rewrite them to have typalign = 'i' by having them
> use two 4-byte loads to load an eight-byte value. In practice, I
> think this would probably save a high percentage of what can be saved,
> because 8-byte alignment implies a maximum of 7 bytes of wasted space,
> while 4-byte alignment implies a maximum of 3 bytes of wasted space.
> And it would probably be pretty cheap, too, because any type with less
> than 8 byte alignment wouldn't be affected at all, and even those
> types that were affected would only be slightly slowed down by doing
> two loads instead of one. In contrast, getting rid of alignment
> requirements completely would save a little more space, but probably
> at the cost of a lot more slowdown: any type with alignment
> requirements would have to fetch the value byte-by-byte instead of
> pulling the whole thing out at once.
>
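
To make the "two 4-byte loads" idea concrete, here is a minimal sketch in C.
It is only an illustration under stated assumptions: the function name
load_u64_two_words is hypothetical and is not taken from the Postgres source,
and the pointer is assumed to be 4-byte aligned but not necessarily 8-byte
aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read an 8-byte value stored at an address that is
 * only guaranteed to be 4-byte aligned, using two aligned 4-byte loads.
 */
static uint64_t load_u64_two_words(const uint32_t *p)
{
    uint32_t halves[2];
    uint64_t value;

    assert(p != NULL);

    halves[0] = p[0];   /* first aligned 4-byte load */
    halves[1] = p[1];   /* second aligned 4-byte load */

    /* Reassemble in memory order, so the result is endian-neutral. */
    memcpy(&value, halves, sizeof value);
    return value;
}
```

Copying the two halves back in memory order avoids having to know whether
the machine is little-endian or big-endian.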

Does byte-by-byte access still hold true nowadays? I thought modern processors
would fetch memory in at least "word"-sized chunks (4 or 8 bytes) and then
merge/slice them.
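
For reference, the portable way to express an unaligned read in C is a
fixed-size memcpy rather than a hand-written byte loop. The sketch below uses
a hypothetical helper name, load_u64_unaligned. My understanding is that
optimizing compilers on x86-64 commonly turn this into a single load
instruction, but that is an assumption about compiler behavior, not something
measured here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read an 8-byte value from an arbitrary, possibly
 * unaligned address. The memcpy keeps the access well-defined in C; how it
 * is lowered (one load, two loads, or byte loads) is up to the compiler
 * and the target architecture.
 */
static uint64_t load_u64_unaligned(const void *p)
{
    uint64_t value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}
```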

> But there are a couple of obvious problems with this idea, too, such as:
>
> 1. It's really complicated and a ton of work.
>
> 2. It would break pg_upgrade pretty darn badly unless we employed some
> even-more-complex strategy to mitigate that.
> 3. The savings might not be enough to justify the effort.
>

Very true.

> It might be interesting for someone to develop a tool measuring the
> number of bytes of alignment padding we lose per tuple or per page and
> gather some statistics on it on various databases. That would give us
> some sense as to the possible savings.
>
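
A rough sketch of the padding arithmetic such a tool would need, written in
C. Everything here is hypothetical (the ColumnSketch struct, align_up, and
padding_bytes are illustration names, not Postgres APIs), and it only models
fixed-length columns laid out in order, ignoring tuple headers, variable-length
types, and trailing padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical column descriptor: byte length and required alignment. */
typedef struct
{
    size_t len;
    size_t align;   /* must be a power of two: 1, 2, 4 or 8 */
} ColumnSketch;

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Sum the padding bytes inserted between columns when they are laid out
 * in order. max_align caps every column's alignment, which allows
 * comparing 8-byte maximum alignment against a 4-byte maximum.
 */
static size_t padding_bytes(const ColumnSketch *cols, size_t ncols,
                            size_t max_align)
{
    size_t offset = 0;
    size_t padding = 0;

    for (size_t i = 0; i < ncols; i++)
    {
        size_t align = cols[i].align < max_align ? cols[i].align : max_align;
        size_t start = align_up(offset, align);

        padding += start - offset;
        offset = start + cols[i].len;
    }
    return padding;
}
```

For a row shaped like (4-byte int, 8-byte int, 1-byte bool, 8-byte int), this
sketch reports 11 padding bytes with a maximum alignment of 8 and 3 padding
bytes with a maximum alignment of 4.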

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
