Re: logical replication empty transactions

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Euler Taveira <euler(at)timbira(dot)com(dot)br>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical replication empty transactions
Date: 2019-11-09 21:28:15
Message-ID: CAMkU=1xcX3Zei7TzamSrgHzP+k4U46Z+Hc4E2yX3JpiehFuNsQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 8, 2019 at 8:59 PM Euler Taveira <euler(at)timbira(dot)com(dot)br> wrote:

> On Mon, Oct 21, 2019 at 21:20, Jeff Janes
> <jeff(dot)janes(at)gmail(dot)com> wrote:
> >
> > After setting up logical replication of a slowly changing table using
> > the built-in pub/sub facility, I noticed way more network traffic than
> > made sense. Looking into it, I see that every transaction in that
> > database on the master gets sent to the replica. 99.999+% of them are
> > empty transactions ('B' message and 'C' message with nothing in between)
> > because the transactions don't touch any tables in the publication, only
> > non-replicated tables. Is doing it this way necessary for some reason?
> > Couldn't we hold the transmission of 'B' until something else comes
> > along, and then if that next thing is 'C' drop both of them?
> >
> That is not optimal. Those empty transactions are a waste of bandwidth.
> We can suppress them if no changes will be sent. test_decoding
> implements "skip empty transaction" as you described above and I did
> something similar to it. Patch is attached.
>

Thanks. I didn't think it would be that simple, because I thought we would
need some way to fake an acknowledgement for any dropped empty
transactions, to keep the LSN advancing and allow WAL to get recycled on
the master. But it turns out to be the opposite. While your patch drops the
network traffic by a lot, there is still a lot of traffic. Now it is
keep-alives, rather than 'B' and 'C'. I don't know why I am getting a few
hundred keep-alives every second when the timeouts are at their defaults,
but it is better than several thousand 'B' and 'C'.
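
For anyone following along, the "hold 'B' until something else comes
along" idea quoted above can be sketched roughly as below. This is only an
illustration in Python, not the attached patch and not the pgoutput code:
the function name skip_empty_transactions and the (kind, xid) tuple layout
are made up for the example, and it ignores keep-alives and LSN feedback
entirely.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical message shape for this sketch: (kind, xid), where kind is
# 'B' (begin), 'C' (commit), or any other letter for a change message.
Message = Tuple[str, int]


def skip_empty_transactions(stream: Iterable[Message]) -> Iterator[Message]:
    """Yield messages, dropping every 'B' that is directly followed by 'C'."""
    pending_begin: Optional[Message] = None
    for msg in stream:
        kind = msg[0]
        if kind == "B":
            # Hold the begin back instead of sending it right away.
            pending_begin = msg
        elif kind == "C":
            if pending_begin is not None:
                # Nothing was sent since 'B': drop both 'B' and 'C'.
                pending_begin = None
            else:
                yield msg
        else:
            # First real change of the transaction: flush the held 'B'.
            if pending_begin is not None:
                yield pending_begin
                pending_begin = None
            yield msg


stream = [("B", 1), ("C", 1), ("B", 2), ("I", 2), ("C", 2), ("B", 3), ("C", 3)]
filtered = list(skip_empty_transactions(stream))
```

With that input, only transaction 2 survives the filter; transactions 1 and
3 never reach the wire.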

My setup here was just to create, publish, and subscribe to an inactive
dummy table, while having pgbench running on the master (with unpublished
tables). I have not created an intentionally slow network, but I am
testing it over wifi, which is inherently kind of slow.

Cheers,

Jeff
