Re: Logical Replication WIP

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Steve Singer <steve(at)ssinger(dot)info>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical Replication WIP
Date: 2016-11-28 21:20:04
Message-ID: 8629c944-7dbf-f2be-7726-f0618099667a@2ndquadrant.com
Lists: pgsql-hackers

On 27/11/16 23:54, Petr Jelinek wrote:
> On 27/11/16 23:42, Erik Rijkers wrote:
>> On 2016-11-27 19:57, Petr Jelinek wrote:
>>> On 22/11/16 18:42, Erik Rijkers wrote:
>>>> A crash of the subscriber can be forced by running vacuum <published
>>>> table> on the publisher.
>>>>
>>>>
>>>> - publisher
>>>> create table if not exists testt( id integer primary key, c text );
>>>> create publication pub1 for table testt;
>>>>
>>>> - subscriber
>>>> create table if not exists testt( id integer primary key, c text );
>>>> create subscription sub1 connection 'dbname=testdb port=6444'
>>>> publication pub1 with (disabled);
>>>> alter subscription sub1 enable;
>>>>
>>>> - publisher
>>>> vacuum testt;
>>>>
>>>> Now a data change on the published table (perhaps also a select on the
>>>> subscriber-side data) leads to:
>>>>
>>>>
>>>> - subscriber log:
>>>> TRAP: FailedAssertion("!(pointer != ((void *)0))", File: "mcxt.c", Line:
>>>> 1001)
>>
>>>
>>> I very much doubt this is a problem with vacuum, as it does not send
>>> anything to the subscriber. Is there anything else you did on those servers?
>>>
>>
>> It is not the vacuum that triggers the crash but the data change (insert
>> or delete, on the publisher) /after/ that vacuum.
>>
>> Just now, I compiled 2 instances from master and such a crash (after
>> vacuum + delete) seems reliable here.
>>
>> (If you can't duplicate such a crash let me know; then I'll dig out more
>> precise set-up detail)
>>
>
> I found the reason. It's not just vacuum (which was what confused me);
> it happens when the publishing side sends the relation info again (which
> it does after a cache invalidation on the relation followed by new data
> being written), and I freed one pointer that I never set. I'll send a
> fixed patch tomorrow.
> Thanks!
>

Okay, here is the fixed patch. I also included your doc fix, added a test
for REPLICA IDENTITY FULL (which also exercises this issue as a side effect),
and fixed one relcache leak.

I also rebased it against current master, as there was a conflict in
bgworker.c.
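
For anyone following along, the class of bug (freeing a pointer that was
never set when relation info arrives again) can be sketched roughly as
below. This is only an illustration and not the code from the patch:
RemoteRelEntry, update_entry, copy_string and checked_free are hypothetical
names, and checked_free merely stands in for a free routine that asserts on
NULL, which is what the TRAP in mcxt.c suggests happened here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a free routine that asserts on NULL input. */
static void checked_free(void *pointer)
{
    assert(pointer != NULL);
    free(pointer);
}

/* Hypothetical cache entry for a remote relation (not the patch's struct). */
typedef struct RemoteRelEntry
{
    char *nspname;   /* NULL until the first relation message is seen */
    char *relname;   /* NULL until the first relation message is seen */
} RemoteRelEntry;

/* Return a heap-allocated copy of s. */
static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char  *copy = malloc(len);

    assert(copy != NULL);
    memcpy(copy, s, len);
    return copy;
}

/*
 * Called every time relation info is (re)sent. Old values are released
 * only if an earlier message actually set them; calling checked_free()
 * unconditionally would trip the assertion on a never-set field.
 */
static void update_entry(RemoteRelEntry *entry,
                         const char *nspname, const char *relname)
{
    if (entry->nspname != NULL)
        checked_free(entry->nspname);
    if (entry->relname != NULL)
        checked_free(entry->relname);

    entry->nspname = copy_string(nspname);
    entry->relname = copy_string(relname);
}
```

Starting from a zero-initialized entry, calling update_entry twice with the
same names models the first relation message followed by the re-send after
an invalidation; with the NULL checks in place neither call hits the
assertion.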

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Add-support-for-TEMPORARY-replication-slots-v9.patch.gz application/gzip 7.6 KB
0002-Refactor-libpqwalreceiver-v9.patch.gz application/gzip 8.8 KB
0003-Add-PUBLICATION-catalogs-and-DDL-v9.patch.gz application/gzip 29.1 KB
0004-Add-SUBSCRIPTION-catalog-and-DDL-v9.patch.gz application/gzip 26.5 KB
0005-Define-logical-replication-protocol-and-output-plugi-v9.patch.gz application/gzip 12.7 KB
0006-Add-logical-replication-workers-v9.patch.gz application/gzip 41.6 KB
0007-Add-separate-synchronous-commit-control-for-logical--v9.patch.gz application/gzip 1.7 KB
