Re: Move PinBuffer and UnpinBuffer to atomics

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, YUriy Zhuravlev <u(dot)zhuravlev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Move PinBuffer and UnpinBuffer to atomics
Date: 2016-03-31 17:21:02
Message-ID: CAPpHfdsd=xR=B2jD5rCbiGnmbTrGHFZAUVMS=r_bPezDChbsKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 31, 2016 at 7:14 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> > +/*
> > + * The following two macros are aimed to simplify buffer state
> modification
> > + * in CAS loop. It's assumed that variable "uint32 state" is defined
> outside
> > + * of this loop. It should be used as following:
> > + *
> > + * BEGIN_BUFSTATE_CAS_LOOP(bufHdr);
> > + * modifications of state variable;
> > + * END_BUFSTATE_CAS_LOOP(bufHdr);
> > + *
> > + * For local buffers, these macros shouldn't be used.. Since there is
> > + * no cuncurrency, local buffer state could be chaged directly by atomic
> > + * read/write operations.
> > + */
> > +#define BEGIN_BUFSTATE_CAS_LOOP(bufHdr) \
> > + do { \
> > + SpinDelayStatus delayStatus =
> init_spin_delay((Pointer)(bufHdr)); \
>
> > + uint32 oldstate; \
> > + oldstate = pg_atomic_read_u32(&(bufHdr)->state); \
> > + for (;;) { \
> > + while (oldstate & BM_LOCKED) \
> > + { \
> > + make_spin_delay(&delayStatus); \
> > + oldstate =
> pg_atomic_read_u32(&(bufHdr)->state); \
> > + } \
> > + state = oldstate
> > +
> > +#define END_BUFSTATE_CAS_LOOP(bufHdr) \
> > + if (pg_atomic_compare_exchange_u32(&bufHdr->state,
> &oldstate, state)) \
> > + break; \
> > + } \
> > + finish_spin_delay(&delayStatus); \
> > + } while (0)
>
> Hm. Not sure if that's not too much magic. Will think about it.
>

I'm not sure too...

> > /*
> > + * Lock buffer header - set BM_LOCKED in buffer state.
> > + */
> > + uint32
> > + LockBufHdr(BufferDesc *desc)
> > + {
> > + uint32 state;
> > +
> > + BEGIN_BUFSTATE_CAS_LOOP(desc);
> > + state |= BM_LOCKED;
> > + END_BUFSTATE_CAS_LOOP(desc);
> > +
> > + return state;
> > + }
> > +
>
> Hm. It seems a bit over the top to do the full round here. How about
>
> uint32 oldstate;
> while (true)
> {
> oldstate = pg_atomic_fetch_or(..., BM_LOCKED);
> if (!(oldstate & BM_LOCKED)
> break;
> perform_spin_delay();
> }
> return oldstate | BM_LOCKED;
>
> Especially if we implement atomic xor on x86, that'd likely be a good
> bit more efficient.
>

Nice idea. Done.

> > typedef struct BufferDesc
> > {
> > BufferTag tag; /* ID of page contained in
> buffer */
> > - BufFlags flags; /* see bit definitions
> above */
> > - uint8 usage_count; /* usage counter for clock sweep
> code */
> > - slock_t buf_hdr_lock; /* protects a subset of fields,
> see above */
> > - unsigned refcount; /* # of backends holding
> pins on buffer */
> > - int wait_backend_pid; /* backend
> PID of pin-count waiter */
> >
> > + /* state of the tag, containing flags, refcount and usagecount */
> > + pg_atomic_uint32 state;
> > +
> > + int wait_backend_pid; /* backend
> PID of pin-count waiter */
> > int buf_id; /* buffer's index
> number (from 0) */
> > int freeNext; /* link in
> freelist chain */
>
> Hm. Won't matter on most platforms, but it seems like a good idea to
> move to an 8 byte aligned boundary. Move buf_id up?
>

I think so. Done.

> > +/*
> > + * Support for spin delay which could be useful in other places where
> > + * spinlock-like procedures take place.
> > + */
> > +typedef struct
> > +{
> > + int spins;
> > + int delays;
> > + int cur_delay;
> > + Pointer ptr;
> > + const char *file;
> > + int line;
> > +} SpinDelayStatus;
> > +
> > +#define init_spin_delay(ptr) {0, 0, 0, (ptr), __FILE__, __LINE__}
> > +
> > +void make_spin_delay(SpinDelayStatus *status);
> > +void finish_spin_delay(SpinDelayStatus *status);
> > +
> > #endif /* S_LOCK_H */
>
> s/make_spin_delay/perform_spin_delay/? The former sounds like it's
> allocating something or such.
>

Done.

I think these changes worth running benchmark again. I'm going to run it
on 4x18 Intel.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
pinunpin-cas-9.patch application/octet-stream 82.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Nullmeier 2016-03-31 17:21:12 Re: WIP: Access method extendability
Previous Message Alexander Ostrow 2016-03-31 17:19:44 syntax sugar for conditional check