Re: Summary of plans to avoid the annoyance of Freezing

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Summary of plans to avoid the annoyance of Freezing
Date: 2015-09-08 23:21:07
Message-ID: 55EF6D63.9030505@BlueTreble.com
Lists: pgsql-hackers

On 9/6/15 7:25 AM, Andres Freund wrote:
> On 2015-08-10 07:03:02 +0100, Simon Riggs wrote:
>> I was previously a proponent of (2) as a practical way forwards, but my
>> proposal here today is that we don't do anything further on 2) yet, and
>> seek to make progress on 5) instead.
>>
>> If 5) fails to bring a workable solution by the Jan 2016 CF then we commit
>> 2) instead.
>>
>> If Heikki wishes to work on (5), that's good. Otherwise, I think it's
>> something I can understand and deliver by 1 Jan, though likely for 1 Nov CF.
>
> I highly doubt that we can get either variant into 9.6 if we only start
> to seriously review them by then. Heikki's lsn ranges patch essentially
> was a variant of 5) and it ended up being a rather complicated patch. I
> don't think using an explicit epoch is going to be that much simpler.
>
> So I think we need to decide now.
>
> My vote is that we should try to get freeze maps into 9.6 - that seems
> more realistic given that we have a patch right now. Yes, it might end
> up being superfluous churn, but it's rather localized. I think we've
> put off significant incremental improvements with the promise of more
> radical stuff too often.

I'm concerned with how to test this. Right now it's rather difficult to
test things like epoch rollover, especially in a way that would expose
race conditions and other corner cases. We obviously got burned by that
on the MultiXact changes, and a lot of our best developers had to spend
a huge amount of time fixing that. ISTM that a way to unit test things
like CLOG/MXID truncation and visibility logic should be created before
attempting a change like this. Would having this kind of test
infrastructure have helped with the LSN patch development? More
importantly, would it have reduced the odds of the MXID bugs, or made it
easier to diagnose them?
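
To make the idea concrete, here is a rough sketch of the kind of unit test
I mean. It is not PostgreSQL code: it is a small Python model of 32-bit XID
comparison using modulo-2^32 arithmetic, in the spirit of
TransactionIdPrecedes() and TransactionIdAdvance(). The names xid_precedes,
xid_advance and check_rollover are hypothetical, and the model ignores
everything else real visibility logic depends on (CLOG, MultiXacts, freezing).

```python
# Sketch only: a toy model of wraparound-aware XID ordering.
# All function names here are hypothetical, not PostgreSQL APIs.

XID_MODULUS = 2 ** 32
FIRST_NORMAL_XID = 3  # lower XIDs are reserved special values


def xid_precedes(a: int, b: int) -> bool:
    """Return True if normal XID a is logically older than normal XID b."""
    diff = (a - b) % XID_MODULUS
    # Treat the 32-bit difference as signed: a "negative" result means a is older.
    return diff >= 2 ** 31


def xid_advance(xid: int, n: int = 1) -> int:
    """Advance xid by n steps, skipping the reserved values after wraparound."""
    for _ in range(n):
        xid = (xid + 1) % XID_MODULUS
        if xid < FIRST_NORMAL_XID:
            xid = FIRST_NORMAL_XID
    return xid


def check_rollover(start: int, steps: int) -> None:
    """Walk across the wraparound point, asserting ordering holds at each step."""
    prev = start
    for _ in range(steps):
        cur = xid_advance(prev)
        assert xid_precedes(prev, cur)
        assert not xid_precedes(cur, prev)
        prev = cur


# Start just below the 2^32 boundary so the walk crosses the rollover.
check_rollover(XID_MODULUS - 5, 10)
```

The point is only that rollover can be exercised in milliseconds when the
ordering logic is testable in isolation, instead of burning through billions
of real transactions to reach the boundary.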

In any case, thanks Simon for the summary. I really like the idea and
will help with it if I can.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
