Re: Ident authentication fails due to bind error on server (8.4.8)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Marinos Yannikos" <mjy(at)geizhals(dot)at>
Cc: "PostgreSQL Bugs" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Ident authentication fails due to bind error on server (8.4.8)
Date: 2011-06-17 15:41:21
Message-ID: 17691.1308325281@sss.pgh.pa.us
Lists: pgsql-bugs

"Marinos Yannikos" <mjy(at)geizhals(dot)at> writes:
> I'm not sure that this is not a configuration or networking issue (so
> apologies if it is), but we seem to be getting rare (a few times/day)
> failures with ident authentication because several clients attempt to do
> it simultaneously over a high-latency connection (capitalized = edited
> IPs/username etc.):

> [DB CLIENTADDR(51985) 3173 2011-06-17 10:49:56 CEST] LOG: could not bind
> to local address "SERVERADDR": Address already in use
> [DB CLIENTADDR(51985) 3173 2011-06-17 10:49:56 CEST] FATAL: Ident
> authentication failed for user "USER"

Hm. What platform is this on?

> Is this a possible race condition in src/backend/libpq/auth.c ?

I don't think it's a race condition per se. The code ought to be
setting up the address argument for bind() with sin_port = 0 so that
an unused port number gets assigned. That seems to be what happens on
a couple of machines that I tried here, but I notice that the Linux
manpage for getaddrinfo says

       service sets the port in each returned address structure.  If
       this argument is a service name (see services(5)), it is
       translated to the corresponding port number.  This argument can
       also be specified as a decimal number, which is simply converted
       to binary.  If service is NULL, then the port number of the
       returned socket addresses will be left uninitialized.

In principle this wording would allow getaddrinfo to return the same
nonzero port number in multiple backends, which would lead to the
reported failure if they were doing ident verification at the same time.
I'm thinking maybe we should explicitly pass "0" rather than NULL to
getaddrinfo here. On the other hand, it seems to work reliably as-is
on my Linux machine, so this is just speculation at this point.
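
To make the idea concrete, here is a minimal standalone sketch of
passing "0" instead of NULL as the service argument. It is not the
actual auth.c code: the function name bind_any_port is made up for
illustration, and the sketch assumes IPv4 and a numeric local address
to stay short. With the service given as "0", sin_port is explicitly
zero, so bind() should let the kernel pick an unused port.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the PostgreSQL sources): bind a TCP
 * socket to local_host and let the kernel choose the port.
 * Returns the socket descriptor, or -1 on failure.  If port_out is
 * not NULL, it receives the port that was actually assigned.
 */
int
bind_any_port(const char *local_host, unsigned short *port_out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int         fd;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;          /* assumption: IPv4 only */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;    /* assumption: numeric address */

    /* Pass "0", not NULL, so the returned port is explicitly zero. */
    if (getaddrinfo(local_host, "0", &hints, &res) != 0)
        return -1;

    fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd >= 0 && bind(fd, res->ai_addr, res->ai_addrlen) != 0)
    {
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);

    if (fd >= 0 && port_out != NULL)
    {
        struct sockaddr_in sin;
        socklen_t   len = sizeof(sin);

        /* Ask the kernel which port it assigned. */
        if (getsockname(fd, (struct sockaddr *) &sin, &len) == 0)
            *port_out = ntohs(sin.sin_port);
        else
            *port_out = 0;
    }
    return fd;
}
```

Two such sockets bound to the same local address at the same time
should end up on different ports, which is the behavior the ident
code needs when several backends run the check concurrently.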

(BTW, is it really sane to be using ident auth over a "high latency
connection"? That would certainly suggest to me that you could be
getting connections from untrustworthy machines ...)

regards, tom lane
