Re: Atomic operations within spinlocks

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Atomic operations within spinlocks
Date: 2020-06-06 02:31:03
Message-ID: 20200606023103.avzrctgv7476xj7i@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-06-05 21:01:56 -0400, Tom Lane wrote:
> > I'm not really sure what to do about that issue. The easisest thing
> > would probably be to change the barrier generation to 32bit (which
> > doesn't have to use locks for reads in any situation).
>
> Yeah, I think we need a hard rule that you can't use a spinlock in
> an interrupt handler --- which means no atomics that don't have
> non-spinlock implementations on every platform.

Yea, that might be the easiest thing to do. The only other thing I can
think of would be to mask all signals for the duration of the
atomic-using-spinlock operation. That'd make the fallback noticably more
expensive, but otoh, do we care enough?

I think a SIGNAL_HANDLER_BEGIN(); SIGNAL_HANDLER_END(); to back an
Assert(!InSignalHandler()); could be quite useful. Could also save
errno etc.

> At some point I think we'll have to give up --disable-spinlocks; it's
> really of pretty marginal use (how often does anyone port PG to a new
> CPU type?) and the number of weird interactions it adds in this area
> seems like more than it's worth.

Indeed. And any new architecture one would port PG to would have good
enough compiler intrinsics to make that trivial. I still think it'd make
sense to have a fallback implementation using compiler intrinsics...

And I think we should just require 32bit atomics at the same time. Would
probably kill gaur though.

I did just find a longstanding bug in the spinlock emulation code:

void
s_init_lock_sema(volatile slock_t *lock, bool nested)
{
static int counter = 0;

*lock = ((++counter) % NUM_SPINLOCK_SEMAPHORES) + 1;
}

void
s_unlock_sema(volatile slock_t *lock)
{
int lockndx = *lock;

if (lockndx <= 0 || lockndx > NUM_SPINLOCK_SEMAPHORES)
elog(ERROR, "invalid spinlock number: %d", lockndx);
PGSemaphoreUnlock(SpinlockSemaArray[lockndx - 1]);
}

I don't think it's ok that counter is a signed integer... While it maybe
used to be unlikely that we ever have that many spinlocks, I don't think
it's that hard anymore, because we dynamically allocate them for a lot
of parallel query stuff. A small regression test that initializes
enough spinlocks indeed errors out with
2020-06-05 18:08:29.110 PDT [734946][3/2:0] ERROR: invalid spinlock number: -126
2020-06-05 18:08:29.110 PDT [734946][3/2:0] STATEMENT: SELECT test_atomic_ops();

> > Randomly noticed while looking at the code:
> > uint64 flagbit = UINT64CONST(1) << (uint64) type;
>
> I'm surprised we didn't get any compiler warnings about that.

Unfortunately I don't think one can currently compile postgres with
warnings for "implicit casts" enabled :(.

> But of course requiring 64-bit atomics is still a step too far.

If we had a 32bit compare-exchange it ought to be possible to write a
signal-safe emulation of 64bit atomics. I think. Something *roughly*
like:

typedef struct pg_atomic_uint64
{
/*
* Meaning of state bits:
* 0-1: current valid
* 2-4: current proposed
* 5: in signal handler
* 6-31: pid of proposer
*/
pg_atomic_uint32 state;

/*
* One current value, two different proposed values.
*/
uint64 value[3];
} pg_atomic_uint64;

The update protocol would be something roughly like:

bool
pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 *expected, uint64 newval)
{
while (true)
{
uint32 old_state = pg_atomic_read_u32(&ptr->state);
uint32 updater_pid = PID_FROM_STATE(old_state);
uint32 new_state;
uint64 old_value;

int proposing;

/*
* Value changed, so fail. This is obviously racy, but we'll
* notice concurrent updates later.
*/
if (ptr->value[VALID_FIELD(old_state)] != *expected)
{
return false;
}

if (updater_pid == INVALID_PID)
{

new_state = old_state;

/* signal that current process is updating */
new_state |= MyProcPid >> PID_STATE_SHIFT;
if (InSignalHandler)
new_state |= PROPOSER_IN_SIGNAL_HANDLER_BIT;

/* set which index is being proposed */
new_state = (new_state & ~PROPOSER_BITS) |
NEXT_PROPOSED_FIELD(old_state, &proposing);

/*
* If we successfully can update state to contain our new
* value, we have a right to do so, and can only be
* interrupted by ourselves, in a signal handler.
*/
if (!pg_atomic_compare_exchange(&ptr->state, &old_state, new_state))
{
/* somebody else updated, restart */
continue;
}

old_state = new_state;

/*
* It's ok to compare the values now. If we are interrupted
* by a signal handler, we'll notice when updating
* state. There's no danger updating the same proposed value
* in two processes, because they they always would get
* offsets to propse into.
*/
ptr->value[proposing] = newval;

/* set the valid field to the one we just filled in */
new_state = (new_state & ~VALID_FIELD_BITS) | proposed;
/* remove ourselve as updater */
new_state &= UPDATER_BITS;

if (!pg_atomic_compare_exchange(&ptr->state, &old_state, new_state))
{
/*
* Should only happen when we were interrupted by this
* processes' handler.
*/
Assert(!InSignalHandler);

/*
* Signal handler must have cleaned out pid as updater.
*/
Assert(PID_FROM_STATE(old_state) != MyProcPid);
continue;
}
else
{
return true;
}
}
else if (PID_FROM_STATE(current_state) == MyProcPid)
{
/*
* This should only happen when in a signal handler. We don't
* currently allow nesting of signal handlers.
*/
Assert(!(current_state & PROPOSER_IN_SIGNAL_HANDLER_BIT));

/* interrupt our own non-signal-handler update */
new_state = old_state | PROPOSER_IN_SIGNAL_HANDLER_BIT;

/* set which index is being proposed */
new_state = (new_state & ~PROPOSER_BITS) |
NEXT_PROPOSED_FIELD(old_state, &proposing);

// FIXME: assert that previous value still was what we assumed
pg_atomic_exchange_u32(&ptr_state.state, new_state);
}
else
{
do
{
pg_spin_delay();

current_state = pg_atomic_read_u32(&ptr->state);
} while (PID_FROM_STATE(current_state) != INVALID_PID)
}
}
}

While that's not trivial, it'd not be that expensive. The happy path
would be two 32bit atomic operations to simulate a 64bit one.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-06-06 02:45:01 Re: v13: Performance regression related to FORTIFY_SOURCE
Previous Message Jeff Davis 2020-06-06 02:06:50 Re: v13: Performance regression related to FORTIFY_SOURCE