Re: 8.4-vintage problem in postmaster.c

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.4-vintage problem in postmaster.c
Date: 2010-11-15 14:24:31
Message-ID: 1289830820-sup-4322@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Tom Lane's message of Sat Nov 13 19:07:50 -0300 2010:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> > On 11/13/2010 06:58 PM, Tom Lane wrote:
> >> Just looking at it, I think that the logic in canAcceptConnections got
> >> broken by somebody in 8.4, and then broken some more in 9.0: in some
> >> cases it will return an "okay to proceed" status without having checked
> >> for TOOMANY children. Was this system possibly in PM_WAIT_BACKUP or
> >> PM_HOT_STANDBY state? What version was actually running?
>
> > I don't have too many details on the actual setup (working on that) but
> > the box in question is running 8.4.2 and had no issues before the
> > upgrade to 8.4 (ie 8.3 was reported to work fine - so a 8.4+ breakage
> > looks plausible).
>
> Well, this failure would certainly involve a flood of connection
> attempts, so it's possible it's a pre-existing bug that they just did
> not happen to trip over before. But the sequence of events that I'm
> thinking about is a smart shutdown attempt (SIGTERM to postmaster)
> while an online backup is in progress, followed by a flood of
> near-simultaneous connection attempts while the backup is still active.

As far as I could gather from Stefan's description, I think this is
pretty unlikely. It seems to me that the "too many children" error
message is very common in the 8.3 setup already, and the only reason
they have a problem on 8.4 is that it crashes instead.
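
For reference, the kind of ordering problem Tom describes in
canAcceptConnections can be sketched roughly as below. This is only a
simplified illustration under my own assumptions: the enum values and
function names (sk_state, sk_cac, can_accept_buggy, can_accept_fixed) are
hypothetical stand-ins, not the actual postmaster.c code.

```c
/* Hypothetical, simplified stand-ins; NOT the real postmaster.c definitions. */
typedef enum { SK_RUN, SK_WAIT_BACKUP, SK_SHUTDOWN } sk_state;
typedef enum { SK_CAC_OK, SK_CAC_WAITBACKUP, SK_CAC_SHUTDOWN, SK_CAC_TOOMANY } sk_cac;

/*
 * Problematic ordering: the state-specific early return hands back a status
 * that lets the connection proceed before the child limit is ever examined.
 */
static sk_cac
can_accept_buggy(sk_state state, int children, int max_children)
{
    if (state == SK_WAIT_BACKUP)
        return SK_CAC_WAITBACKUP;   /* child count never checked on this path */
    if (state != SK_RUN)
        return SK_CAC_SHUTDOWN;
    if (children >= max_children)
        return SK_CAC_TOOMANY;
    return SK_CAC_OK;
}

/*
 * Safer ordering: work out the state-dependent result first, then apply the
 * child limit to every result that would allow the connection to proceed.
 */
static sk_cac
can_accept_fixed(sk_state state, int children, int max_children)
{
    sk_cac result = SK_CAC_OK;

    if (state == SK_WAIT_BACKUP)
        result = SK_CAC_WAITBACKUP;
    else if (state != SK_RUN)
        return SK_CAC_SHUTDOWN;     /* rejected anyway, limit is irrelevant */

    if (children >= max_children)
        result = SK_CAC_TOOMANY;
    return result;
}
```

With the first version, a flood of connection attempts arriving in the
wait-for-backup state is never turned away for exceeding the child limit;
the second version rejects them in every state that would otherwise admit
a connection.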

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
