Re: Help required to debug pg_repack breaking logical replication

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Help required to debug pg_repack breaking logical replication
Date: 2017-10-08 13:21:03
Message-ID: CAMsr+YFt_J6BpiYEv6Ri90h6mLGZpada5zZ0kG6O+HbN2zUEcg@mail.gmail.com
Lists: pgsql-hackers

On 8 October 2017 at 02:37, Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com> wrote:
> Hello,
>
> we have received reports of, and I have experienced a couple of times,
> pg_repack breaking logical replication.
>
> - https://github.com/reorg/pg_repack/issues/135
> - https://github.com/2ndQuadrant/pglogical/issues/113

Yeah, I was going to say I've seen reports of this with pglogical, but
I see you've linked to them.

I haven't had a chance to look into it though, and haven't had a
suitable reproducible test case.

> In the above issue #113, Petr Jelinek commented:
>
>> From a quick look at pg_repack, the way it does the table rewrite is almost guaranteed
>> to break logical decoding unless there are zero unconsumed changes for a given table,
>> as it does not build the necessary mapping info for logical decoding that the standard
>> heap rewrite in postgres does.
>
> Unfortunately he didn't follow up on requests for further details.

At a guess, he's referring to src/backend/access/heap/rewriteheap.c.

I'd explain better if I understood what was going on myself, but I
haven't really understood the logical decoding parts of that code.

> - Is Petr's diagnosis right, and is the missing mapping to blame for
> logical replication freezing?
> - Can you suggest a test to reproduce the issue reliably?
> - What are mapped relations anyway?

I can't immediately give you the answers you seek, but start by
studying src/backend/access/heap/rewriteheap.c, notably
logical_begin_heap_rewrite, logical_rewrite_heap_tuple and
logical_end_heap_rewrite.
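
As a starting point for that reading, a small Python sketch like the
following can locate where those three functions are defined in a
source checkout. It assumes the usual PostgreSQL layout in which a
function definition puts the function name at the start of a line
(with the return type on the line above); find_definitions and TARGETS
are hypothetical names used only for illustration.

```python
import re

# Function names mentioned above; assumed to be defined in rewriteheap.c.
TARGETS = (
    "logical_begin_heap_rewrite",
    "logical_rewrite_heap_tuple",
    "logical_end_heap_rewrite",
)


def find_definitions(source_text, names=TARGETS):
    """Return {name: [line numbers]} for lines that begin with `name(`.

    Assumption: definitions start with the function name in column 0,
    so one-line prototypes and indented call sites are not matched.
    """
    hits = {name: [] for name in names}
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for name in names:
            if re.match(re.escape(name) + r"\s*\(", line):
                hits[name].append(lineno)
    return hits
```

To use it, read the file's contents (for example
open("src/backend/access/heap/rewriteheap.c").read() from the top of a
PostgreSQL source tree) and pass the text to find_definitions.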

As a wild stab in the dark (I haven't read any of the relevant code in
detail yet), pg_repack is probably failing to do the bookkeeping that
logical decoding requires around relfilenode changes, cmin/cmax, etc.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
