RE: Assuming that TAS() will succeed the first time is verboten

From: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: RE: Assuming that TAS() will succeed the first time is verboten
Date: 2000-12-28 21:56:39
Message-ID: 8F4C99C66D04D4118F580090272A7A234D321D@sectorbase1.sectorbase.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> The bottom line turns out to be that on the Alpha hardware, it is
> possible for TAS() to fail even when the lock is initially zero,
> because that hardware's locking protocol will fail to acquire the
> lock if the ldq_l/stq_c sequence is interrupted. TAS() *must* be
> called in a retry loop on Alphas. Thus, the coding presently in
> xlog.c,
>
> while (TAS(&(XLogCtl->chkp_lck)))
> {
> struct timeval delay = {2, 0};
>
> if (shutdown)
> elog(STOP, "Checkpoint lock is busy while data
> base is shutting down");
> (void) select(0, NULL, NULL, NULL, &delay);
> }
>
> is no good because it does not allow for multiple retries.
>
> Offhand I see no good reason why the above-quoted code isn't just
>
> S_LOCK(&(XLogCtl->chkp_lck));
>
> and propose to fix this problem by reducing it to that. If the lock
> is held when it shouldn't be, we'll fail with a stuck-spinlock error.

It was not test against stuck slock: postmaster can shutdown db only
when there is no backends running and this was just additional test
for that. Assuming that postmaster does right things I don't object
to get rid of this test, but I would just remove shutdown test itself
- checkpoints run long time and 2 sec timeout is better than ones
in s_lock_sleep();

> It also bothers me that xlog.c contains several places where
> there is a potentially infinite wait for a lock. It seems to me
> that these should time out with stuck-spinlock messages. Do you object
> to such a change?

I always considered time-based decision is slock stuck or not as hack -
in old times elog(ERROR/FATAL) didn't unlock slocks sometimes, isn't
it fixed now?

Anyway I don't object if it bothers you -:)
But please do not use S_LOCK - as you see WAL code try to do other
things if slock of "primary interest" is busy.

Vadim

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-12-28 22:06:55 Re: Assuming that TAS() will succeed the first time is verboten
Previous Message Nathan Myers 2000-12-28 21:47:44 Re: Assuming that TAS() will succeed the first time is verboten