Re: Connection problem under extreme load.

From: Jeffery Collins <collins(at)onyx-technologies(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, "Andrew A(dot) Burtnett" <aburtnett(at)onyx-technologies(dot)com>
Subject: Re: Connection problem under extreme load.
Date: 2000-07-28 12:57:21
Message-ID: 39818331.653B0DD9@onyx-technologies.com
Lists: pgsql-general

Tom Lane wrote:

> Interesting. I *think* (not totally sure) that 'Connection refused'
> here implies that the kernel rejected the connection before the
> postmaster ever had a chance to do anything with it. The most likely
> reason would probably be that the maximum connection backlog was
> exceeded. On my system (HPUX) man listen(2) sez
>
> int listen(int s, int backlog);
>
> ...
>
> backlog defines the desirable queue length for pending connections.
> The actual queue length may be greater than the specified backlog. If
> a connection request arrives when the queue is full, the client will
> receive an ETIMEDOUT error.
>
> backlog is limited to the range of 0 to SOMAXCONN, which is defined in
> <sys/socket.h>. SOMAXCONN is currently set to 20. If any other value
> is specified, the system automatically assigns the closest value
> within the range. A backlog of 0 specifies only 1 pending connection
> is allowed at any given time.
>
> ETIMEDOUT is not the error you are getting, but that could be a platform
> difference. In fact the nearest BSD system I have access to says that
> "the client will receive an error with an indication of ECONNREFUSED".
> The same box defines SOMAXCONN as 5, which seems a tad low :-(
>
> So, it would seem your options are
> (a) recompile your kernel with larger SOMAXCONN, or
> (b) figure out why the postmaster isn't responding faster.
>
> Offhand, the only performance problem I know of in the postmaster is
> that it does IDENT checks serially --- if you specify ident checks in
> pg_hba.conf, the postmaster will wait for a response from the ident
> server before processing more connection requests. So if you're using
> IDENT authentication you might want to consider some other answer, or
> else fix that code and send in a patch.
>
> If that's not it, please poke into it further and let us know what you
> find out.
>
> regards, tom lane

I think you are correct. The listen man page on my machine (Sun Solaris)
says:

If a connection request arrives with the queue full, the client will
receive an error with an indication of ECONNREFUSED...

SOMAXCONN is also defined as 5 here, which IS a tad low.
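
For anyone who wants to check the limit on their own machine without digging through <sys/socket.h>, here is a minimal sketch. It uses Python purely for illustration; nothing in this thread was tested with it, and open_listener is a hypothetical helper name. Note that socket.SOMAXCONN reports the constant Python was built against, which may differ from the limit the running kernel actually enforces.

```python
import socket


def open_listener(requested_backlog, host="127.0.0.1", port=0):
    """Open a TCP listener with the requested backlog.

    The kernel may silently clamp the backlog to its own maximum,
    so asking for more than the system allows raises no error.
    port=0 lets the OS pick a free port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(requested_backlog)
    return srv


if __name__ == "__main__":
    # Compile-time constant as exposed by the Python socket module.
    print("SOMAXCONN as seen by Python:", socket.SOMAXCONN)
    srv = open_listener(64)
    print("listening on", srv.getsockname())
    srv.close()
```
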

Unfortunately, I am not able to rebuild the kernel, so option (a) is out.

As to why the postmaster was not responding faster, I think it was because of
the load on the machine. The load was so heavy, and there were so many
connection requests at the same time, I am not surprised that it could not
keep up. My test was probably not a realistic load.

I think my best option is to retry the connection when this happens. I do
wish my kernel returned a different error, because there really is no way to
distinguish a legitimate ECONNREFUSED (i.e. the server really isn't
listening) from a full backlog queue.
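
The retry idea could look roughly like the sketch below. It is written in Python only as an illustration, not as the client code used in this test; connect_with_retry, its parameters, and the backoff values are all assumptions. Because a full backlog and a dead server look the same to the client, the sketch caps the number of attempts and re-raises the error once they are used up.

```python
import random
import socket
import time


def connect_with_retry(host, port, attempts=5, base_delay=0.1, timeout=5.0,
                       sleep=time.sleep, connect=socket.create_connection):
    """Try to connect, retrying only when the connection is refused.

    Waits with exponential backoff plus random jitter between attempts
    so that many clients do not all retry at the same instant.
    `sleep` and `connect` are injectable to make the function testable.
    """
    for attempt in range(attempts):
        try:
            return connect((host, port), timeout=timeout)
        except ConnectionRefusedError:
            # Either the backlog queue is full or nothing is listening;
            # the client cannot tell which, so give up after `attempts`.
            if attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

A caller would use it as connect_with_retry("dbhost", 5432), where the host name is a placeholder. If every attempt is refused, the final ConnectionRefusedError propagates, which is the point at which it is reasonable to conclude the server really is down.
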

Once again, thank you very much,
Jeff
