Re: make check hang on AIX 5L p690 4way/I have two solutions

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tomoyuki Niijima <NIIJIMA(at)jp(dot)ibm(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: make check hang on AIX 5L p690 4way/I have two solutions
Date: 2002-09-02 04:41:42
Message-ID: 200209020441.g824fhp29551@candle.pha.pa.us


I have applied the following patch to PostgreSQL CVS. If there are AIX
portability issues, they will show up during beta testing. Thanks for
the fix. I have heard of other AIX folks with similar problems.

---------------------------------------------------------------------------

Tomoyuki Niijima wrote:
> Your name : Tomoyuki Niijima
> Your email address : niijima(at)jp(dot)ibm(dot)com
>
>
> System Configuration
> ---------------------
> Architecture (example: Intel Pentium) : IBM 7040-681 (pSeries 690) 4way (LPAR)
>
> Operating System (example: Linux 2.0.26 ELF) : AIX 5L 5.1
>
> PostgreSQL version (example: PostgreSQL-7.2.1): PostgreSQL-7.2.1
>
> Compiler used (example: gcc 2.95.2) : gcc 2.9
>
>
> Please enter a FULL description of your problem:
> ------------------------------------------------
> I built PostgreSQL with the steps below and saw backends hang during the
> regression test. The problem has been reproduced on two machines, but both
> have the same hardware and software. I also tried to reproduce the problem
> on other machines running older versions of AIX, but I could not.
>
>
> Please describe a way to repeat the problem. Please try to provide a
> concise reproducible example, if at all possible:
> ----------------------------------------------------------------------
> ./configure --enable-multibyte=EUC_JP --with-CC=gcc
> make
>
> I learned that the backend was sleeping in semop() by attaching dbx (the
> AIX debugger) to one of the 'postgres:' processes.
>
>
>
> If you know how this problem might be fixed, list the solution below:
> ---------------------------------------------------------------------
> After looking through the pgsql-hackers mailing list, I focused on the spin
> lock issue to solve the problem. The easiest, though perhaps not the best,
> solution is to give up HAS_TEST_AND_SET. This actually works.
>
> *** src/include/port/aix.h.org Tue Feb 13 23:32:52 2001
> --- src/include/port/aix.h Fri Aug 30 01:02:28 2002
> ***************
> *** 1,8 ****
> #define CLASS_CONFLICT
> #define DISABLE_XOPEN_NLS
> ! #define HAS_TEST_AND_SET
> #define NO_MKTIME_BEFORE_1970
> ! typedef unsigned int slock_t;
>
> #include <sys/machine.h> /* ENDIAN definitions for network
> * communication */
> --- 1,8 ----
> #define CLASS_CONFLICT
> #define DISABLE_XOPEN_NLS
> ! /* #define HAS_TEST_AND_SET */
> #define NO_MKTIME_BEFORE_1970
> ! /* typedef unsigned int slock_t; */
>
> #include <sys/machine.h> /* ENDIAN definitions for network
> * communication */
>
>
> Another, better solution is to use _check_lock() and _clear_lock() as the
> spin lock. The important thing here is to define S_UNLOCK() with
> _clear_lock(). This will solve the so-called "Compiler bug" issue that
> someone wrote about on the mailing list.
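
For readers without an AIX box at hand, here is a minimal sketch of how a TAS()/S_UNLOCK() pair built on these two calls behaves. It is not the AIX implementation: sketch_check_lock() and sketch_clear_lock() are hypothetical stand-ins written with C11 atomics, and they assume the documented AIX behaviour as I understand it, namely that _check_lock() returns FALSE when the swap succeeded and TRUE when the word was left unchanged. The slock_t here is an atomic_int purely for the sketch; the real one on AIX is a plain int.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in lock word for this sketch only. */
typedef atomic_int slock_t;

/* Hypothetical portable stand-in for _check_lock(): atomically set *word to
 * new_val if it currently equals old_val. Returns false when the swap
 * happened (lock acquired) and true when *word was left unchanged. */
static bool sketch_check_lock(slock_t *word, int old_val, int new_val)
{
    int expected = old_val;
    return !atomic_compare_exchange_strong_explicit(
        word, &expected, new_val,
        memory_order_acquire, memory_order_relaxed);
}

/* Hypothetical stand-in for _clear_lock(): store value with release
 * ordering so writes made under the lock are visible before the unlock. */
static void sketch_clear_lock(slock_t *word, int value)
{
    atomic_store_explicit(word, value, memory_order_release);
}

/* Same shape as the macros in the patch: TAS() is zero on success. */
#define TAS(lock)      sketch_check_lock((lock), 0, 1)
#define S_UNLOCK(lock) sketch_clear_lock((lock), 0)

/* Walks through acquire, failed acquire, release, re-acquire.
 * Returns the number of successful acquisitions. */
static int tas_semantics_demo(void)
{
    slock_t lock = 0;
    int acquired = 0;

    if (!TAS(&lock))            /* lock was free: acquired */
        acquired++;

    bool second = TAS(&lock);   /* lock is held: must report failure */
    assert(second);

    S_UNLOCK(&lock);            /* release through the unlock primitive */

    if (!TAS(&lock))            /* free again: acquired */
        acquired++;
    S_UNLOCK(&lock);

    return acquired;
}
```

The point of routing S_UNLOCK() through a dedicated primitive rather than a plain assignment is that the release then carries its own memory ordering, so neither the compiler nor the hardware can move stores out of the critical section past the unlock.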
>
> AIX has some other APIs for test-and-set, such as cs(), compare_and_swap()
> and fetch_and_or(), but none of these solved my problem. I wrote a tiny
> test program to see whether any of these AIX APIs has a bug, but I could
> not find any problem except with compare_and_swap(). It seems that you
> cannot use compare_and_swap() for this purpose, as it did not work as a
> spin lock on any SMP machine I tested. I don't know why neither cs() nor
> fetch_and_or()/fetch_and_and() works with PostgreSQL on p690. These worked
> with my test program on all machines I tested.
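
The test program itself was not posted, so the following is only a guess at the general shape such a check might take: several threads increment a shared counter under a spin lock, and the final count shows whether any increments were lost. It uses POSIX threads and C11 atomics as a portable stand-in rather than the AIX calls named above, and every name in it (worker, run_spinlock_stress_test, NUM_THREADS, ITERATIONS) is made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_THREADS 4        /* illustrative: one per CPU on a 4-way box */
#define ITERATIONS  50000    /* illustrative iteration count per thread */

static atomic_int test_lock = 0;   /* 0 = free, 1 = held */
static long shared_counter = 0;    /* protected by test_lock */

/* Each worker repeatedly takes the lock, bumps the counter, and releases. */
static void *worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < ITERATIONS; i++) {
        int expected = 0;
        /* Spin until we swap 0 -> 1. On failure the CAS overwrites
         * expected with the current value, so reset it before retrying. */
        while (!atomic_compare_exchange_weak_explicit(
                   &test_lock, &expected, 1,
                   memory_order_acquire, memory_order_relaxed))
            expected = 0;

        shared_counter++;   /* critical section */

        atomic_store_explicit(&test_lock, 0, memory_order_release);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. A correct lock
 * yields exactly NUM_THREADS * ITERATIONS. */
static long run_spinlock_stress_test(void)
{
    pthread_t threads[NUM_THREADS];

    shared_counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void) rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    return shared_counter;
}
```

A test of this kind can only show that a primitive is broken, not that it is correct. A primitive that passes here may still fail inside PostgreSQL, where the lock is shared between processes rather than threads and the contention pattern is different, which matches the observation above that cs() and fetch_and_or() passed the test but still hung on p690.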
>
> *** ./src/include/storage/s_lock.h.org Fri Aug 30 01:13:15 2002
> --- ./src/include/storage/s_lock.h Wed Jan 30 00:44:42 2002
> ***************
> *** 440,447 ****
> * Note that slock_t on POWER/POWER2/PowerPC is int instead of char
> * (see storage/ipc.h).
> */
> ! #define TAS(lock) _check_lock(lock, 0, 1)
> ! #define S_UNLOCK(lock) _clear_lock(lock, 0)
> #endif /* _AIX */
>
>
> --- 440,446 ----
> * Note that slock_t on POWER/POWER2/PowerPC is int instead of char
> * (see storage/ipc.h).
> */
> ! #define TAS(lock) cs((int *) (lock), 0, 1)
> #endif /* _AIX */
>
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
