9.4 logical replication - walsender keepalive replies

From:	Steve Singer <steve(at)ssinger(dot)info>
To:	PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Andres Freund <andres(at)2ndquadrant(dot)com>
Subject:	9.4 logical replication - walsender keepalive replies
Date:	2014-06-30 15:40:50
Message-ID:	BLU436-SMTP25712B7EF9FC2ADEB87C522DC040@phx.gbl
Lists:	pgsql-hackers

In 9.4 we've added the block of code below to walsender.c:

    /*
     * We only send regular messages to the client for full decoded
     * transactions, but a synchronous replication and walsender shutdown
     * possibly are waiting for a later location. So we send pings
     * containing the flush location every now and then.
     */
    if (MyWalSnd->flush < sentPtr && !waiting_for_ping_response)
    {
        WalSndKeepalive(true);
        waiting_for_ping_response = true;
    }

I am finding that my logical replication reader is spending a tremendous
amount of time sending feedback to the server because a keepalive reply
was requested. My flush pointer is smaller than sentPtr, which I see as
the normal case (the client hasn't confirmed all the WAL it has been
sent). My client queues the records it receives and only confirms them
when it actually processes the record.

So the sequence looks something like

Server Sends LSN 0/1000
Server Sends LSN 0/2000
Server Sends LSN 0/3000
Client confirms LSN 0/2000
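
To make the effect of that sequence concrete, here is a minimal sketch in C that models only the quoted condition. It is not walsender code: SenderModel, Lsn, and the helper functions are hypothetical names made up for illustration, and the sketch assumes that the check runs after every send and that any client reply clears waiting_for_ping_response. Under those assumptions, a client that confirms even one record late is asked for a reply after every record it is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position; 0/1000 is modeled as 0x1000. */
typedef uint64_t Lsn;

/* Hypothetical model of the state the quoted check reads. */
typedef struct SenderModel
{
    Lsn  sent;                       /* plays the role of sentPtr */
    Lsn  flush;                      /* plays the role of MyWalSnd->flush */
    bool waiting_for_ping_response;  /* same name as in the quoted block */
    int  reply_requests;             /* keepalives sent with a reply requested */
} SenderModel;

/* Mirrors the quoted condition: ask for a reply whenever flush lags sent. */
static void maybe_request_reply(SenderModel *s)
{
    assert(s != NULL);
    if (s->flush < s->sent && !s->waiting_for_ping_response)
    {
        s->reply_requests++;         /* stands in for WalSndKeepalive(true) */
        s->waiting_for_ping_response = true;
    }
}

/* The server sends WAL up to lsn, then runs the check (an assumption). */
static void server_send(SenderModel *s, Lsn lsn)
{
    assert(s != NULL);
    s->sent = lsn;
    maybe_request_reply(s);
}

/* The client reports what it has processed; assumed to clear the flag. */
static void client_reply(SenderModel *s, Lsn flushed)
{
    assert(s != NULL);
    s->flush = flushed;
    s->waiting_for_ping_response = false;
}

/*
 * Replay n_records sends against a client that answers every request but
 * only confirms the record lag_records behind the newest one it received.
 * Returns how many reply requests the model generated.
 */
static int simulate_lagging_client(int n_records, int lag_records)
{
    SenderModel s = {0};

    for (int i = 1; i <= n_records; i++)
    {
        int done = i - lag_records;  /* newest record the client has processed */

        server_send(&s, (Lsn) i * 0x1000);
        client_reply(&s, done > 0 ? (Lsn) done * 0x1000 : 0);
    }
    return s.reply_requests;
}
```

Calling simulate_lagging_client(3, 1) replays the three sends listed above against a client that is one record behind, and the model issues one reply request per record. A sender whose flush and sent positions are equal never triggers the request.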

I don't see why all these keepalive replies are needed in this case.
(The timeout value is bumped way up; it is the block above that is
triggering the reply requests, not anything related to the timeout.)

Steve

Responses

Re: 9.4 logical replication - walsender keepalive replies at 2014-07-06 14:11:19 from Andres Freund
