LDAP authentication fails with concurrent create extensions

From: Greg k <gregg(dot)kay(at)gmail(dot)com>
To: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: LDAP authentication fails with concurrent create extensions
Date: 2018-04-10 07:37:18
Message-ID: CAC5zpv3PZVZmSTj+ugpC54mC5uK+hRbN6w3f6gmFGR10phGGsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

We are using postgresql 10.3 on Centos 7.2 with LDAP authentication (samba4
with AD domain controller). We've recently moved to LDAP authentication and
are now encountering a problem where some concurrent connections that
create extensions in different databases at the same time are failing with
a "Can't contact LDAP server" error. The postgres error log contains:

========================================================================================
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-1]
db=db_test_022,user=fullaccess LOG: 00000: LDAP login failed for user
"cn=dev_fullaccess,
OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX" on server "
server1.ZZ.net server2.ZZ.net": Can't contact LDAP server
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-2]
db=db_test_022,user=fullaccess LOCATION: CheckLDAPAuth, auth.c:2593
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [10-1]
db=db_test_022,user=fullaccess FATAL: 28000: LDAP authentication failed
for user "fullaccess"
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-2]
db=db_test_022,user=fullaccess DETAIL: Connection matched pg_hba.conf line
12: "host all
+systemlogin 0.0.0.0/0 ldap ldapserver="
server1.ZZ.net server2.ZZ.net" ldapprefix="cn=dev_"
ldapsuffix=",OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX""
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-3]
db=db_test_022,user=fullaccess LOCATION: auth_failed, auth.c:328
========================================================================================

The postgres version is "PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled
by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit" although we get
similar failures with Postgres 9.5.9.

The problem is reproducible with this bash script and assumes the user
password is in .pgpass:

========================================================================================
#!/bin/bash

set -u

declare -r NUM_DATABASES=50
declare -r NUM_CREATE_EXTENSIONS=1

declare -r DBHOST=kvmdevdb12
declare -r PORT=5437
declare -r USER=fullaccess

declare -r EXTENSION_NAME="btree_gin"
#declare -r EXTENSION_NAME="pgcrypto"
#declare -r EXTENSION_NAME="pg_buffercache"
#declare -r EXTENSION_NAME="tablefunc"

declare -r LOG=/tmp/drop_create.log
declare -r SIGNAL_FILE=/tmp/start_drop_create.flg
declare -r ERROR_SIGNAL_FILE=/tmp/connect_error

function get_now()
{
date +'%Y-%m-%d %H:%M:%S.%N'
}

function create_extension()
{
local -r DBNAME=$1

psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "DROP EXTENSION IF
EXISTS $EXTENSION_NAME; CREATE EXTENSION $EXTENSION_NAME SCHEMA extensions"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create extension
\"$EXTENSION_NAME\" in database \"$DBNAME\" for user \"$USER\"." | tee -a
$LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi

}

function drop_create_db()
{
local -r DBNAME=$1

psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "DROP DATABASE IF
EXISTS $DBNAME"
psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "CREATE DATABASE
$DBNAME"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create database
\"$DBNAME\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "CREATE SCHEMA
extensions"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create schema for
\"$DBNAME\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi

while [[ ! -f $SIGNAL_FILE ]]; do
:
done;

for i in $(seq 1 $NUM_CREATE_EXTENSIONS); do
create_extension $DBNAME
done
}

echo "$(get_now): Postgres version: $(psql -qt -p $PORT -h $DBHOST -d
postgres -U $USER -c 'SELECT VERSION()')"
echo

echo "$(get_now): Setting up $NUM_DATABASES background jobs..."

rm -f $SIGNAL_FILE
rm -f $ERROR_SIGNAL_FILE

declare DBNAME=""
declare NUM=1
while [[ 1 ]]; do
echo "$(get_now): Creating \"$DBHOST:$PORT/$DBNAME\"..." >> $LOG
DBNAME=$(printf "db_test_%03d" $NUM)
drop_create_db $DBNAME &
if [[ $NUM -ge $NUM_DATABASES ]]; then
break
fi
NUM=$(($NUM + 1))
echo >> $LOG
done

echo "$(get_now): Signal the $NUM_DATABASES background jobs to start
creating extensions..."
touch $SIGNAL_FILE

echo "$(get_now): Waiting for all the create extensions to finish..."
echo
wait
echo

if [[ -f $ERROR_SIGNAL_FILE ]]; then
echo "$(get_now): THERE WERE ERRORS."
echo
fi
echo "$(get_now): *** Finished." | tee -a $LOG

========================================================================================

I did some simple debugging into 'openldap' and could get this debug
information about the failure:

========================================================================================
ldap_init: trying /var/lib/pgsql/ldap/etc/openldap/ldap.conf
ldap_init: HOME env is /var/lib/pgsql
ldap_init: trying /var/lib/pgsql/ldaprc
ldap_init: trying /var/lib/pgsql/.ldaprc
ldap_init: trying ldaprc
ldap_init: LDAPCONF env is NULL
ldap_init: LDAPRC env is NULL
ldap_create
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP server1.om.net:389
ldap_new_socket: 6
ldap_prepare_socket: 6
ldap_connect_to_host: Trying 192.168.111.111:389
ldap_pvt_connect: fd: 6 tm: -1 async: 0
attempting to connect:
connect success
ldap_open_defconn: successful
ldap_send_server_request
ldap_result ld 0x1d76380 msgid 1
wait4msg ld 0x1d76380 msgid 1 (infinite timeout)
wait4msg continue ld 0x1d76380 msgid 1 all 1
** ld 0x1d76380 Connections:
* host: server1.om.net port: 389 (default)
refcnt: 2 status: Connected
last used: Tue Apr 10 15:01:01 2018

** ld 0x1d76380 Outstanding Requests:
* msgid 1, origid 1, status InProgress
outstanding referrals 0, parent count 0
ld 0x1d76380 request count 1 (abandoned 0)
** ld 0x1d76380 Response Queue:
Empty
ld 0x1d76380 response count 0
ldap_chkResponseList ld 0x1d76380 msgid 1 all 1
ldap_chkResponseList returns ld 0x1d76380 NULL
ldap_int_select
ldap_int_select returned -1: errno 4
<====================== ERROR HERE
ldap_unbind
ldap_free_request (origid 1, msgid 1)
ldap_free_connection 1 1
ldap_send_unbind
ldap_free_connection: actually freed
ldap_err2string
========================================================================================

errno 4 is EINTR and the function actually failing is 'poll' inside the
os-ip.c function 'ldap_int_select'.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Huong Dangminh 2018-04-10 08:30:19 power() function in Windows: "value out of range: underflow"
Previous Message PG Bug reporting form 2018-04-09 19:15:52 BUG #15148: font is too small to see