Re: Logical replication wal sender timestamp bug

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical replication wal sender timestamp bug
Date: 2019-11-05 04:19:37
Message-ID: 20191105041937.GG16867@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 02, 2019 at 09:54:54PM -0400, Jeff Janes wrote:
> While monitoring pg_stat_subscription, I noticed that last_msg_send_time
> was usually NULL, which doesn't make sense. Why would we have a message,
> but not know when it was sent?

So... The timestamp is received and stored in LogicalRepApplyLoop()
with send_time when a 'w' message is received in the subscription
cluster. And it gets computed with a two-phase process:
- WalSndPrepareWrite() reserves the space in the message for the
timestamp.
- WalSndWriteData() computes the timestamp in the reserved space once
the write message is computed and ready to go.

> Filling out the timestamp after the message has already been sent is taking
> "as late as possible" a little too far. It results in every message having
> a zero timestamp, other than keep-alives which go through a different path.

It seems to me that you are right: the timestamp is computed too
late.

> Re-ordering the code blocks as in the attached seems to fix it. But I have
> to wonder, if this has been broken from the start and no one ever noticed,
> is this even valuable information to relay in the first place? We could
> just take the column out of the view, and not bother calling
> GetCurrentTimestamp() for each message.

I think that there are use cases for such monitoring capabilities,
see for example 7fee252. So I would rather keep it.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2019-11-05 04:59:25 Re: Keep compiler silence (clang 10, implicit conversion from 'long' to 'double' )
Previous Message Michael Paquier 2019-11-05 03:41:38 Re: TAP tests aren't using the magic words for Windows file access