Re: logical replication - still unstable after all these months

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Erik Rijkers <er(at)xs4all(dot)nl>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: logical replication - still unstable after all these months
Date: 2017-06-05 01:08:12
Message-ID: 221f0df9-2af6-4a4f-8c4c-7fb5509118e3@catalyst.net.nz
Lists: pgsql-hackers

On 05/06/17 00:04, Erik Rijkers wrote:

> On 2017-05-31 16:20, Erik Rijkers wrote:
>> On 2017-05-31 11:16, Petr Jelinek wrote:
>> [...]
>>> Thanks to Mark's offer I was able to study the issue as it happened and
>>> found the cause of this.
>>>
>>> [0001-Improve-handover-logic-between-sync-and-apply-worker.patch]
>>
>> This looks good:
>>
>> -- out_20170531_1141.txt
>> 100 -- pgbench -c 90 -j 8 -T 60 -P 12 -n -- scale 25
>> 100 -- All is well.
>>
>> So this is 100x a 1-minute test with 100x success. (This on the most
>> fastidious machine (slow disks, meagre specs) that used to give 15%
>> failures)
>
> [Improve-handover-logic-between-sync-and-apply-worker-v2.patch]
>
> No errors after (several days of) running variants of this. (2500x 1
> minute runs; 12x 1-hour runs)

Same here: no errors with the v2 patch applied (approx. 2 days of
testing, all 1-minute runs).

regards

Mark
