Re: Apparent walsender bug triggered by logical replication

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Apparent walsender bug triggered by logical replication
Date: 2017-06-30 02:46:30
Message-ID: 20499.1498790790@sss.pgh.pa.us
Lists: pgsql-hackers

Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> writes:
> On 30/06/17 02:07, Tom Lane wrote:
>> I'm also kind of wondering why the "behind the apply" path out of
>> LogicalRepSyncTableStart exists at all; as far as I can tell we'd be much
>> better off if we just let the sync worker exit always as soon as it's done
>> the initial sync, letting any extra catchup happen later. The main thing
>> the current behavior seems to be accomplishing is to monopolize one of the
>> scarce max_sync_workers_per_subscription slots for the benefit of a single
>> table, for longer than necessary. Plus it adds additional complicated
>> interprocess signaling.

> Hmm, I don't understand what you mean here. The "letting any extra
> catchup happen later" would never happen if the sync is behind apply as
> apply has already skipped relevant transactions.

Once the sync worker has exited, we have to have some other way of dealing
with that. I'm wondering why we can't let that other way take over
immediately. The existing approach is inefficient, according to the
traces I've been poring over all day, and frankly I am very far from
convinced that it's bug-free either.

regards, tom lane
