| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: PGPROC alignment (was Re: pgsql: Separate RecoveryConflictReasons from procsignals) |
| Date: | 2026-02-10 18:15:01 |
| Message-ID: | aYtsrQ-kmA8nqeiH@alap3.anarazel.de |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-committers pgsql-hackers |
Hi,
On 2026-02-10 19:14:44 +0200, Heikki Linnakangas wrote:
> On 10/02/2026 18:41, Andres Freund wrote:
> > On 2026-02-10 17:52:16 +0200, Heikki Linnakangas wrote:
> > > If there's a performance reason to keep have it be aligned - and maybe there
> > > is - we should pad it explicitly.
> >
> > We should make it a power of two or such. There are some workloads where the
> > indexing from GetPGProcByNumber() shows up, because it ends up having to be
> > implemented as a 64 bit multiplication, which has a reasonably high latency
> > (3-5 cycles). Whereas a shift has a latency of 1 and typically higher
> > throughput too.
>
> Power of two means going to 1024 bytes. That's a lot of padding. Where have
> you seen that show up?
LWLock contention heavy code, due to the GetPGProcByNumber() in LWLockWakeup()
and other similar places.
> Attached is a patch to align to cache line boundary. That's straightforward
> if that's what we want to do.
Yea, I think we should do that. Even if we don't see a difference today, just
because it's a hell to find production issues around this, because it is so
dependent on what processes use which PGPROC etc and because false sharing
issues are generally expensive to debug.
> > Re false sharing: We should really separate stuff that changes (like
> > e.g. pendingRecoveryConflicts) and never changing stuff (backendType). You
> > don't need overlapping structs to have false sharing issues if you mix
> > different access patterns inside a struct that's accessed across processes...
>
> Makes sense, although I don't want to optimize too hard for performance, at
> the expense of readability. The current order is pretty random anyway,
> though.
Yea, I don't think we need to be perfect here. Just a bit less bad. And, as
you say, the current order doesn't make a lot of sense.
Just grouping things like
- pid, pgxactoff, backendType (i.e. barely if ever changing)
- wait_event_info, waitStart (i.e. very frequently changing, but typically
accessed within one proc)
- sem, lwWaiting, waitLockMode (i.e. stuff that is updated frequently and
accessed across processes)
> It'd probably be good to move the subxids cache to the end of the struct.
> That'd act as natural padding, as it's not very frequently used, especially
> the tail end of the cache.
Yea, that'd make sense.
> Or come to think of it, it might be good to move the subxids cache out of
> PGPROC altogether. It's mostly frequently accessed in GetSnapshotData(), and
> for that it'd actually be better if it was in a separate "mirrored" array,
> similar to the main xid and subxidStates. That would eliminate the
> pgprocnos[pgxactoff] lookup from GetSnapshotdata() altogether.
I doubt it's worth it - that way we'd need to move a lot larger array around
during [dis]connect. The subxids stuff is a lot larger than the xid,
statusFlags arrays...
> I'm a little reluctant to mess with this without a concrete benchmark
> though. Got one in mind?
I'm travelling this week, but I'll try to recreate the benchmarks I've seen
this on. Unfortunately you really need a large machine to really see
differences, without that the memory latency between cores is just too low to
realistically see issues.
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bertrand Drouvot | 2026-02-10 19:15:27 | Re: PGPROC alignment (was Re: pgsql: Separate RecoveryConflictReasons from procsignals) |
| Previous Message | Heikki Linnakangas | 2026-02-10 17:14:44 | PGPROC alignment (was Re: pgsql: Separate RecoveryConflictReasons from procsignals) |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alexandre Felipe | 2026-02-10 18:19:28 | Re: [WIP]Vertical Clustered Index (columnar store extension) - take2 |
| Previous Message | Alexander Lakhin | 2026-02-10 18:00:00 | Re: Instability in postgres_fdw regression tests |