Re: Restore-reliability mode

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, pgsql-core <pgsql-core(at)postgresql(dot)org>
Subject: Re: Restore-reliability mode
Date: 2015-06-08 17:44:05
Message-ID: 20150608174405.GJ24173@momjian.us
Lists: pgsql-hackers

On Sat, Jun 6, 2015 at 03:58:05PM -0400, Noah Misch wrote:
> On Fri, Jun 05, 2015 at 08:25:34AM +0100, Simon Riggs wrote:
> > This whole idea of "feature development" vs reliability is bogus. It
> > implies people that work on features don't care about reliability. Given
> > the fact that many of the features are actually about increasing database
> > reliability in the event of crashes and corruptions it just makes no sense.
>
> I'm contrasting work that helps to keep our existing promises ("reliability")
> with work that makes new promises ("features"). In software development, we
> invariably hazard old promises to make new promises; our success hinges on
> electing neither too little nor too much risk. Two years ago, PostgreSQL's
> track record had placed it in a good position to invest in new, high-risk,
> high-reward promises. We did that, and we emerged solvent yet carrying an
> elevated debt service ratio. It's time to reduce risk somewhat.
>
> You write about a different sense of "reliability." (Had I anticipated this
> misunderstanding, I might have written "Restore-probity mode.") None of this
> was about classifying people, most of whom allocate substantial time to each
> kind of work.
>
> > How will we participate in cleanup efforts? How do we know when something
> > has been "cleaned up", how will we measure our success or failure? I think
> > we should be clear that wasting N months on cleanup can *fail* to achieve a
> > useful objective. Without a clear plan it almost certainly will do so. The
> > flip side is that wasting N months will cause great amusement and dancing
> > amongst those people who wish to pull ahead of our open source project and
> > we should take care not to hand them a victory from an overreaction.
>
> I agree with all that. We should likewise take care not to become insolvent
> from an underreaction.

I understand the overreaction/underreaction debate. Here were my goals
in this discussion:

1. stop worrying about the 9.5 timeline so we could honestly assess our
software - *done*
2. seriously address multi-xact issues without 9.5/commit-fest pressure -
*in process*
3. identify any other areas in need of serious work

While I like the list you provided, I don't think we can be effective in
an environment where we assume every big new feature will have problems
like multi-xact. For example, we have not seen destabilization from any
major 9.4 features, that I can remember anyway.

Unless there is consensus about new areas for #3, I am thinking we will
continue looking at multi-xact until we are happy, then move ahead with
9.5 items in the way we have before.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
