[BUG FIX] Removing NamedLWLockTrancheArray

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [BUG FIX] Removing NamedLWLockTrancheArray
Date: 2017-03-03 08:13:42
Message-ID: 20170303.171342.134582021.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, some of my collegues found that orafce crashes with
postgresql compliled with dtrace.

=== The cause

The immediate cause was I think that it just did
LWLockNewTrancheId and forget to do LWLockRegisterTranche. But
this is hardly detectable by a module developer.

Typical tracepoint looks like the following.

> TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(T_NAME(lock), mode);

Where T_NAME is (LWLockTrancheArray[(lock)->tranche]), as you see
T_NAME assumes that all tranches pointed from lwlocks are
available on the array but the tranches without
LWLockRegisterTranche are not. What is worse this doesn't harm so
much without dtrace (or LWLOCK_STATS or error in LWLockRelease)
. Even if dtrace is activated one or two unregistred tranches
(faultly) have their seat (filled with trash) at the end of the
array with the current code.

As my understanding there are two ways to use lwlocks in
extension. One is using LWLockNewTrancheId and
LWLockRegisterTanche on shared memory provided by the module. The
other is using RequestNamedLWLockTranche in _PG_init and
GetNamedLWLockTranche to acquire locks provided by postgresql.

I couldn't find a documentation about lwlock and trance in
extentions, is there any?

=== How to fix this

The most straightforward way to fix this is chekcing the validity
of tranche id on initilization of a lwlock. Even though I think
that degradation won't be a matter here, NamedLWLockTrancheArray
makes the things very ineffective. After some consideration I
decided to remove NamedLWLockTrancheArray.

=== The patch

The attached patch (is for current master) is aming to fix all
the problem by doing the following two things.

- Remove NameLWLockTrancheArray and all tranches are registered
in LWLockTrancheArray. This seems to work at least for
!EXEC_BAKCEND environment but I haven't tested with EXEC_BACKEND.

- Check tranche id in LWLockInitialize.

The first one required refactoring of CreateLWLocks. It is
changed to register tranches first then initialize lwlokcs.

The problem is found with PG9.6 and this should be backpatched at
least to the version. I haven't tested PG9.5 and 9.4 but it seems
to need different amendment. 9.3 doesn't has tranche.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Remove-NamedLWLockTrancheArray.patch text/x-patch 6.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2017-03-03 08:18:26 Re: \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Previous Message Amit Langote 2017-03-03 07:51:51 Re: Documentation improvements for partitioning