Re: [CORE] Restore-reliability mode

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, pgsql-core <pgsql-core(at)postgresql(dot)org>
Subject: Re: [CORE] Restore-reliability mode
Date: 2015-06-05 04:28:33
Message-ID: CAB7nPqQsBAKuCWSqd834LwC0T+g8=yzD1GnPts6oMe4Ewrpjbg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 5, 2015 at 8:53 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>
>
> On 4 June 2015 at 22:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>> Josh,
>>
>> * Josh Berkus (josh(at)agliodbs(dot)com) wrote:
>> > I would argue that if we delay 9.5 in order to do a 100% manual review
>> > of code, without adding any new automated tests or other non-manual
>> > tools for improving stability, then it's a waste of time; we might as
>> > well just release the beta, and our users will find more issues than we
>> > will. I am concerned that if we declare a cleanup period, especially in
>> > the middle of the summer, all that will happen is that the project will
>> > go to sleep for an extra three months.
>>
>> This is the exact same concern that I have. A delay just to have a
>> delay is not useful. I completely agree that we need more automated
>> testing, etc, though getting all of that set up and running could be
>> done at any time too- there's no reason to wait, nor do I believe
>> delaying 9.5 would make such automated testing appear.
>>
>
> In terms of specific testing improvements, things I think we need to have
> covered and runnable on the buildfarm are:
>
> * pg_dump and pg_restore testing (because it's scary we don't do this)

We do test it to some extent through pg_upgrade, using the set of
objects that the regression test suite does not remove. Extension dumps
are not covered yet, though.

> * WAL archiving based warm standby testing with promotion
> * Two node streaming replication with promotion, both with a slot and with
> archive fallback
> * Three node cascading streaming replication with middle node promotion then
> tail end node promotion
> * Logical decoding streaming testing, comparing to expected decoded output
> * hard-kill the postmaster, start up from crashed datadir
> * pg_basebackup + start up from backup
> * pg_start_backup, rsync, pg_stop_backup, start up in hot standby
> * Tests of crash recovery during various DDL operations

Well, steps in this direction are the point of this patch, the
replication test suite:
https://commitfest.postgresql.org/5/197/
And this one, addition of Windows support for TAP tests:
https://commitfest.postgresql.org/5/207/

> * DDL deparse test coverage for all operations

What do you have in mind beyond what is already in objectaddress.sql
and src/test/modules/test_ddl_deparse/?

> * disk exhaustion tests both for pg_xlog and for the main datadir, showing
> we can recover OK when disk is filled then space is freed

This may be tricky. How would you emulate that?

> Is pg_tap a reasonable starting point for this sort of testing?

IMO, using the TAP machinery would be a good base for that. What is
lacking is a basic set of Perl routines that one can easily use to set
up test scenarios.
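
To make "a basic set of routines" a bit more concrete, here is a minimal
sketch of the kind of helper layer meant here. It is written in Python
only for illustration (the in-tree TAP tests are Perl), and every name in
it (TapSession, run_scenario, the check descriptions) is hypothetical and
does not exist in the PostgreSQL tree. It only shows how small helpers
can emit TAP output; the node-management routines a replication test
would need are not covered.

```python
# Hypothetical sketch of a tiny TAP (Test Anything Protocol) producer.
# None of these names come from the PostgreSQL source tree.
import io


class TapSession:
    """Collects test results and writes them out as TAP lines."""

    def __init__(self, out):
        self.out = out
        self.count = 0
        self.failed = 0

    def ok(self, condition, description):
        # Emit one "ok N - desc" or "not ok N - desc" line.
        self.count += 1
        if not condition:
            self.failed += 1
        status = "ok" if condition else "not ok"
        self.out.write(f"{status} {self.count} - {description}\n")
        return bool(condition)

    def is_equal(self, got, expected, description):
        # Like ok(), but prints a diagnostic on mismatch.
        passed = self.ok(got == expected, description)
        if not passed:
            self.out.write(f"# got: {got!r}\n# expected: {expected!r}\n")
        return passed

    def done_testing(self):
        # TAP allows the plan line ("1..N") to come after the tests.
        self.out.write(f"1..{self.count}\n")
        return self.failed == 0


def run_scenario(out):
    tap = TapSession(out)
    # Placeholders for real checks such as "standby caught up".
    tap.ok(True, "primary node started")
    tap.is_equal("streaming", "streaming", "standby is streaming")
    return tap.done_testing()


if __name__ == "__main__":
    import sys
    run_scenario(sys.stdout)
```

A real helper set would wrap node setup, backup, and promotion behind
routines of this size, so that each scenario reads as a short list of
checks.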

> How would a test that would've caught the multixact issues look?

I have not followed those discussions closely, so I am not sure about that.

Regards,
--
Michael
