Re: orangutan seizes up during isolation-check

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, davec(at)postgresintl(dot)com, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: orangutan seizes up during isolation-check
Date: 2015-01-02 04:04:44
Message-ID: 20150102040444.GA2212447@tornado.leadboat.com
Lists: pgsql-hackers

On Wed, Dec 31, 2014 at 01:56:08PM -0500, Noah Misch wrote:
> On Wed, Dec 31, 2014 at 12:32:37AM -0500, Robert Haas wrote:
> > On Sun, Dec 28, 2014 at 4:58 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > I wondered whether to downgrade FATAL to LOG in back branches. Introducing a
> > > new reason to block startup is disruptive for a minor release, but having the
> > > postmaster deadlock at an unpredictable later time is even more disruptive. I
> > > am inclined to halt startup that way in all branches.
> >
> > Jeepers. I'd rather not do that. From your report, this problem has
> > been around for years. Yet, as far as I know, it's bothering very few
> > real users, some of whom might be far more bothered by the postmaster
> > suddenly failing to start. I'm fine with a FATAL in master, but I'd
> > vote against doing anything that might prevent startup in the
> > back-branches without more compelling justification.
>
> Clusters hosted on OS X fall into these categories:
>
> 1) Unaffected configuration. This includes everyone setting a valid messages
> locale via LANG, LC_ALL or LC_MESSAGES.
> 2) Affected configuration. Through luck and light use, the cluster would not
> experience the crashes/hangs.
> 3) Cluster would experience the crashes/hangs.
>
> DBAs in (3) want the FATAL at startup, but those in (2) want a LOG message
> instead. DBAs in (1) don't care. Since intermittent postmaster hangs are far
> worse than startup failure, if (2) and (3) have similar population, FATAL is
> the better bet. If (2) is sufficiently more populous than (3), then the many
> small pricks from startup failure do add up to hurt more than the occasional
> postmaster hang. Who knows how that calculation plays out.

The first attached patch, for all branches, adds LOG-level messages and an
assertion. So cassert builds will fail hard, while others won't. The second
patch, for master only, changes the startup-time message to FATAL. If we
decide to use FATAL in all branches, I would just squash them into one.
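
As a rough illustration of how the two patches combine, here is a small
decision sketch. It is not the patch code. The names StartupOutcome,
BuildInfo and startup_outcome are hypothetical, the way a multithreaded
postmaster is detected is left outside the sketch (it arrives as a plain
bool), and the choice that FATAL takes precedence over the assertion when
both apply is an assumption made only for the example.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not PostgreSQL identifiers. */
typedef enum
{
    OUTCOME_CONTINUE,       /* nothing detected, startup proceeds */
    OUTCOME_LOG,            /* LOG-level message, startup proceeds */
    OUTCOME_ASSERT_FAILURE, /* assertion trips; cassert builds fail hard */
    OUTCOME_FATAL           /* FATAL message, startup halts */
} StartupOutcome;

typedef struct
{
    bool fatal_patch_applied;   /* second patch (master only) applied? */
    bool cassert_build;         /* built with assertions enabled? */
} BuildInfo;

/*
 * Map "is the postmaster multithreaded?" plus the build flavor to the
 * behavior described above.  Detection itself is not modeled here.
 */
StartupOutcome
startup_outcome(bool multithreaded, BuildInfo build)
{
    if (!multithreaded)
        return OUTCOME_CONTINUE;

    /* Assumption for this sketch: FATAL wins over the assertion. */
    if (build.fatal_patch_applied)
        return OUTCOME_FATAL;

    /* First patch only: assertion-enabled builds fail hard ... */
    if (build.cassert_build)
        return OUTCOME_ASSERT_FAILURE;

    /* ... while other builds just emit a LOG-level message. */
    return OUTCOME_LOG;
}
```

Under these assumptions, a back-branch production build that finds itself
multithreaded maps to OUTCOME_LOG, and the same condition with the second
patch applied maps to OUTCOME_FATAL.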

Attachment                            Content-Type  Size
darwin-check-multithreaded-v1.patch   text/plain    4.3 KB
darwin-multithreaded-fatal-v1.patch   text/plain    1.3 KB
