Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amir Rohan <amir(dot)rohan(at)zoho(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amir Rohan <amir(dot)rohan(at)mail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Date: 2015-10-08 07:39:29
Message-ID: CAB7nPqS_G_DD6pq_1jjFi33_mniYFJsyP6Y85jhtYJHXJaO1YQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 8, 2015 at 3:59 PM, Amir Rohan <amir(dot)rohan(at)zoho(dot)com> wrote:
> On 10/08/2015 08:19 AM, Michael Paquier wrote:
>> On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote:
>>> On 10/07/2015 10:29 AM, Michael Paquier wrote:
>>>> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote:
>>>>> Also, the removal of poll_query_until from pg_rewind looks suspiciously
>>>>> like a copy-paste gone bad. Pardon if I'm missing something.
>>>>
>>>> Perhaps. Do you have a suggestion regarding that? It seems to me that
>>>> this is more useful in TestLib.pm as-is.
>>>>
>>>
>>> My mistake, the patch only shows some internal function being deleted
>>> but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is
>>> a better place for it.
>>
>> OK. Here is a new patch version. I have removed the restriction
>> preventing to call make_master multiple times in the same script (one
>> may actually want to test some stuff related to logical decoding or
>> FDW for example, who knows...), forcing PGHOST to always use the same
>> value after it has been initialized. I have added a sanity check
>> though, it is not possible to create a node based on a base backup if
>> no master are defined. This looks like a cheap insurance... I also
>> refactored a bit the code, using the new init_node_info to fill in the
>> fields of a newly-initialized node, and I removed get_free_port,
>> init_node, init_node_from_backup, enable_restoring and
>> enable_streaming from the list of routines exposed to the users, those
>> can be used directly with make_master, make_warm_standby and
>> make_hot_standby. We could add them again if need be, somebody may
>> want to be able to get a free port, set up a node without those
>> generic routines, just that it does not seem necessary now.
>> Regards,
>>
>
> If you'd like, I can write up some tests for cascading replication which
> are currently missing.

Test 001 already covers cascading, with a chain like node1 -> node2 -> node3.

> Someone mentioned a daisy chain setup which sounds fun. Anything else in
> particular? Also, it would be nice to have some canned way to measure
> end-to-end replication latency for variable number of nodes.

Hm. Do you mean comparing the LSN positions of two nodes even if the
two nodes are not connected to each other? What would you use it for?
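
For reference, comparing two LSN positions is plain arithmetic once the
textual "hi/lo" hexadecimal form is converted to a 64-bit integer. A minimal
sketch in Python, not part of the patch; the function names are hypothetical
and the LSN strings would have to be fetched from each node separately:

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a 64-bit integer.

    The part before the slash is the high 32 bits, the part after it
    the low 32 bits, both written in hexadecimal.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_lag_bytes(upstream_lsn: str, downstream_lsn: str) -> int:
    """Return how many bytes of WAL the downstream position is behind.

    Hypothetical helper: a positive result means the downstream node
    has not yet reached the upstream position.
    """
    return lsn_to_int(upstream_lsn) - lsn_to_int(downstream_lsn)
```

This only gives a distance in bytes of WAL; turning it into a latency in time
would need timestamps taken on both sides.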

> What about going back through the commit log and writing some regression
> tests for the real stinkers, if someone care to volunteer some candidate
> bugs

I have drafted a list with a couple of items upthread:
http://www.postgresql.org/message-id/CAB7nPqSgffSPhOcrhFoAsDAnipvn6WsH2nYkf1KayRm+9_MTGw@mail.gmail.com
So based on the existing patch the list becomes as follows:
- wal_retrieve_retry_interval with a high value, for example 2 or 3
seconds, then loop, checking every second, until the data has been
received by the standby.
- recovery_target_action
- archive_cleanup_command
- recovery_end_command
- pg_xlog_replay_pause and pg_xlog_replay_resume
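
The check-every-second loop from the first item can be sketched as follows.
This is a generic illustration in Python, not the poll_query_until routine of
TestLib.pm; the name poll_until and its parameters are assumptions made for
the example, and the check callable stands in for whatever query the test
would run on the standby:

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool],
               attempts: int = 30,
               interval: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() until it returns True or the attempts run out.

    Returns True as soon as check() succeeds, False if every attempt
    failed. The sleep function is injectable so that the loop can be
    exercised without actually waiting.
    """
    for attempt in range(attempts):
        if check():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

A test would pass a check that asks the standby whether the expected data has
arrived, and fail the test if poll_until returns False.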
Among the things that could have a test, I recall that we should also
test 2PC with the recovery delay, see a1105c3d. This could be included
in 005.
The advantage of implementing that now is that we could see whether the
existing routines are solid enough. Still, looking at what the patch
has now, I think that we had better get a committer to look at it. If
the core portion gets integrated, we could already use it for the patch
implementing quorum synchronous replication, and for more advanced
tests of pg_rewind regarding timeline handling (both patches are in
this CF). I don't mind adding more now, though I think that the set of
sample tests included in this version is enough as a base
implementation of the facility and shows what it can do.
Regards,
--
Michael
