Re: [CORE] Restore-reliability mode

From: Andres Freund <andres(at)anarazel(dot)de>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, pgsql-core <pgsql-core(at)postgresql(dot)org>
Subject: Re: [CORE] Restore-reliability mode
Date: 2015-06-05 15:36:41
Message-ID: 20150605153641.GZ30287@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-06-05 11:05:14 -0400, Bruce Momjian wrote:
> To release 9.5 beta would be to get back into that cycle, and I am not
> sure we are ready for that. I think the fact we have multiple people
> all reviewing the multi-xact code now (and not dealing with 9.5) is a
> good sign. If we were focused on 9.5 beta, I doubt this would have
> happened.

At least for me, that I'm working on multixacts right now has nothing to
do with to beta or not to beta.

And I don't understand why releasing an alpha or beta would detract from
that right now. We need more people doing crazy shit with our codebase,
not fewer.

None of the master-only issues is a blocker for an alpha, so besides
some release work within the next two weeks I don't see what'd distract
us that much?

> I am saying let's make sure we are not deficient in other areas, then
> let's move forward again.

I don't think we actually can do that. The problem of the multixact
stuff is precisely that it looked so innocent that a bunch of
experienced people just didn't see the problem. Omniscience is easy in
hindsight.

> I would love to think we can do multiple things at once, but for
> multi-xact, serious review didn't happen for 18 months, so if slowing
> release development is what is required, I support it.

FWIW, I can stomach a week or four of doing bugfix only stuff. After
that I'm simply not going to be efficient at that anymore. And I
seriously doubt that I'm the only one like that. Doing the same thing
for weeks makes you miss obvious stuff.

I don't think anything as localized as 'do nothing but bugfixes for a
while and then carry on' actually will solve the problem. We need to
find and reallocate resources to put more emphasis on review, robustness
and refactoring in the long term, not do panicky stuff short term. This
isn't a problem that can be solved by focusing on bugfixing for a week
or four.

That means we have to convince employers to actually *pay* us (people
experienced with the codebase) to do work on these kinds of things
instead of much-easier-to-market new features. A lot of
review/robustness work has been essentially done in our spare time,
after long days. Which means the employers need to get more people.

> Sure. I think everyone agrees the multi-xact work is all good, so I am
> asking what else needs this kind of research. If there is nothing else,
> we can move forward again --- I am just saying we need to ask the
> reliability question _first_.

I'm starting to get grumpy here. You've called for review in lots of
emails now. Let's get going then?

Greetings,

Andres Freund
