Re: Skylake-S warning

From: Andres Freund <andres(at)anarazel(dot)de>
To: Daniel Wood <hexexpert(at)comcast(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Skylake-S warning
Date: 2018-10-05 17:29:55
Message-ID: 20181005172955.wyjb4fzcdzqtaxjq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2018-10-03 17:28:59 -0700, Daniel Wood wrote:
> FYI, be careful with padding PGXACT's to a full cache line.

I'm not actually thinking of doing that, but just to round it up so we
don't have PGXACTs spanning cachelines. It's currently 12bytes, so we
end up with one spanning 60-72, then from 120-132 etc. In my quick
testing - on a larger westmere machine, so quite possibly outdated -
that reduces the variance quite a bit. I think we should make sure it's
16 bytes. What I'd previously suggested is that we use the additional
space for the database oid or such, that'd allow for some future
improvements.

I think we also should:
- make sure the pgprocnos array is cacheline aligned. It's not updated
as frequently, but I still saw some minor benefits. Adding a
pg_attribute_aligned(64) should be sufficient.
- remove the volatiles from GetSnapshotData(). As we've, for quite a
while now, made sure both lwlocks and spinlocks are proper barriers
they're not needed.
- reduce the largely redundant flag tests. With the previous change done
the compiler should be able to do so, but there's no reason to not
start from somewhere sane. I'm kinda wondering about backpatching
this part.

To me this list seems lik efairly straight-forward changes that we
should do soon.

One thing I observed when re-familiarizing myself with this is that
currently the assignment of pgprocnos is a bit weird - we actually don't
start with the lowest pgprocno/PGPROC, but instead start with the
*highest* one. That's because InitProcGlobal() populates the free lists
in a 'highest first' manner:

/*
* Newly created PGPROCs for normal backends, autovacuum and bgworkers
* must be queued up on the appropriate free list. Because there can
* only ever be a small, fixed number of auxiliary processes, no free
* list is used in that case; InitAuxiliaryProcess() instead uses a
* linear search. PGPROCs for prepared transactions are added to a
* free list by TwoPhaseShmemInit().
*/
if (i < MaxConnections)
{
/* PGPROC for normal backend, add to freeProcs list */
procs[i].links.next = (SHM_QUEUE *) ProcGlobal->freeProcs;
ProcGlobal->freeProcs = &procs[i];
procs[i].procgloballist = &ProcGlobal->freeProcs;
}

So we constantly push the current element to the start of the list.

I, and apparently the CPUs I have access too, think that's a poor
idea. It means there's less predictable access patterns at volatile
PGXACT *pgxact = &allPgXact[pgprocno]; in GetSnapshotData(), because
instead of starting with the first backend, we start with something that
changes on a regular basis. And it's not likely to be on a cacheline
boundary. I think we should change that so procono 0 is the first on
->freeProcs. Obviously that's not guaranteed that usage remains that
way, but there's very little reason, except some minor coding advantage
to keep it like it is.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2018-10-05 18:47:11 WIP: Avoid creation of the free space map for small tables
Previous Message Tom Lane 2018-10-05 17:10:32 Re: Performance improvements for src/port/snprintf.c