Re: Should we cacheline align PGXACT?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Bernd Helmle <mailings(at)oopsware(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Should we cacheline align PGXACT?
Date: 2017-02-15 19:15:27
Message-ID: 20170215191527.a2b6dbdakomrwgkh@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017-02-15 11:43:17 -0500, Robert Haas wrote:
> On Mon, Feb 13, 2017 at 11:07 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > It seems to me that Andres comments here were largely ignored:
> > https://www.postgresql.org/message-id/20160822021747.u5bqx2xwwjzac5u5@alap3.anarazel.de
> > He was suggesting to increase the struct size to 16 bytes rather than
> > going all the way up to 128. Did anybody test this?
>
> So, I think that going up to 128 bytes can't really make sense. If
> that's the best-performing solution here, then maybe what we ought to
> be doing is reverting the PGXACT/PGPROC separation, sticking these
> critical members at the beginning, and padding the whole PGXACT out to
> a multiple of the cache line size. Something that only touches the
> PGXACT fields will touch the exact same number of cache lines with
> that design, but cases that touch more than just the PGXACT fields
> should perform better.

I don't think that's true for several reasons. Separating out PGXACT
didn't just mean reducing the stride size of the access / preventing
sharing. It also meant that frequently changing fields in PGPROC aren't
on the same cache-line as fields in PGXACT. That makes quite a
difference, because with the split a lot of the cachelines "backing"
PGPROC can stay in 'shared' mode in several CPU caches, while
modifications to PGPROC largely can stay in 'exclusive' mode locally on
the CPU the backend is currently running on. I think I previously
mentioned, even just removing the MyPgXact->xmin assignment in
SnapshotResetXmin() is measurable performance wise and cache-hit ratio
wise.

While Nasby is right that going to multiple of the actual line size can
cause problems due to the N-way associativity of caches, it's not like
unused padding memory >= cacheline-size actually causes cache to be
wasted; it just gets used for other things.

FWIW, I however think we should just set the cacheline size based on the
architecture, rather than defaulting to 128 - it's not like that
actually changes particularly frequently. I doubt a special-case for
each of x86, arm, power, itanium, and the rest will cause us a lot of
maintenance issues.

> The point of moving PGXACT into a separate
> structure was to reduce the number of cache lines required to build a
> snapshot. If that's going to go back up to 1 per PGPROC anyway, we
> might as well at least use the rest of that cache line to store
> something else that might be useful instead of pad bytes. I think,
> anyway.

I don't think that's true, due to the aforementioned dirtyness issue,
which causes inter-node cache coherency traffic (busier bus & latency).

> I think our now-standard way of doing this is to have a "Padded"
> version that is a union between the unpadded struct and char[]. c.f.
> BufferDescPadded, LWLockPadded.

That's what my original POC patch did IIRC, but somebody complained it
changed a lot of accesses.

- Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-02-15 19:23:20 Re: Should we cacheline align PGXACT?
Previous Message Robert Haas 2017-02-15 19:03:46 Re: [PROPOSAL] Temporal query processing with range types