Re: LDAP check flapping on crake due to race

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LDAP check flapping on crake due to race
Date: 2020-08-02 16:09:25
Message-ID: 2134966.1596384565@sss.pgh.pa.us
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Sun, Aug 02, 2020 at 05:29:57PM +1200, Thomas Munro wrote:
>> There are one or two failures per month on crake. It looks like when
>> authentication is rejected, as expected in the tests, the psql process
>> is exiting, but there is a race where the Perl script still wants to
>> write a dummy query to its stdin (?), so you get:
>> psql: FATAL: LDAP authentication failed for user "test1"
>> ack Broken pipe: write( 13, 'SELECT 1' ) at
>> /usr/share/perl5/vendor_perl/IPC/Run/IO.pm line 549.

> Do you suppose a fix like e12a472 would cover this? ("psql <&-" fails with
> status 1 after successful authentication, and authentication failure gives
> status 2.)

AFAICT the failure is happening down inside PostgresNode::psql's call
of IPC::Run::run, so we don't really have the option to adjust things
in exactly that way. (And messing with sub psql for the benefit of
this one caller seems pretty risky anyway.)

I'm inclined to suggest that the LDAP test's test_access could use
an empty stdin and pass "-c 'SELECT 1'" as a command line option
instead. (Maybe that's exactly what you meant, but I'm not sure.)
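
For illustration only, here is a rough Python sketch of the two approaches. It is not the PostgresNode or IPC::Run code: the child is a hypothetical stand-in that exits with status 2 without reading stdin, roughly as psql does when authentication is rejected, and the names FAILING_CHILD, query_via_stdin, and query_via_argument are invented for this sketch. It assumes a POSIX system. The wait() call forces the losing side of the race so the result is repeatable.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a psql whose authentication is rejected:
# it exits with status 2 without ever reading stdin. This is not psql.
FAILING_CHILD = [sys.executable, "-c", "import sys; sys.exit(2)"]


def query_via_stdin(query: bytes) -> str:
    """Feed the query through a pipe to the child's stdin."""
    proc = subprocess.Popen(FAILING_CHILD, stdin=subprocess.PIPE)
    proc.wait()  # force the losing side of the race: the child is already gone
    try:
        # Write straight to the pipe so the error is not hidden by buffering.
        os.write(proc.stdin.fileno(), query)
        return "written"
    except BrokenPipeError:
        return "broken pipe"
    finally:
        proc.stdin.close()


def query_via_argument(query: str) -> int:
    """Pass the query as an argument and give the child an empty stdin."""
    result = subprocess.run(FAILING_CHILD + [query], stdin=subprocess.DEVNULL)
    return result.returncode


if __name__ == "__main__":
    print(query_via_stdin(b"SELECT 1"))
    print(query_via_argument("SELECT 1"))
```

In the second variant the parent never writes to the child, so the only thing left to observe is the exit status. Whether the real test behaves the same way is untested here.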

I've not been able to duplicate this locally, so I have no idea if
that'd really fix it.

regards, tom lane
