Re: Question about WalSndWriteData

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: k(dot)knizhnik(at)postgrespro(dot)ru
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: Question about WalSndWriteData
Date: 2018-03-30 05:25:32
Message-ID: 20180330.142532.00144570.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

At Thu, 29 Mar 2018 20:11:03 +0300, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> wrote in <04171363-a07d-62b7-4362-32de5414941c(at)postgrespro(dot)ru>
>
>
> On 21.03.2018 10:08, Konstantin Knizhnik wrote:
> >
> >
> > On 21.03.2018 04:50, Peter Eisentraut wrote:
> >> On 3/16/18 12:08, Konstantin Knizhnik wrote:
> >>> pq_putmessage_noblock copies data from ctx->out buffer to libpq
> >>> buffers.
> >>> After it we write timestamp to ctx->out buffer.
> >>> And comments says that we should do it "as late as possible".
> >>> But this timestamp is not included in the copy data packet which is
> >>> already copied to libpq connection buffer.
> >> There is a pq_flush_if_writable() right after this that will write out
> >> the rest of ctx->out.
> >>
> > Sorry, But PQ_flush_if_writable calls socket_flush_if_writable (or
> > mq_flush_if_writable).
> > This function flushes pqlib connection buffer, i.e. PqSendBuffer.
> > This buffer has no relation to ctx->out_buffer, where timestamp is
> > written.
> >
> > The obvious fix is to move assignment of timestamp prior to
> > pq_putmessage_noblock:

Yeah. The content of ctx->out is already copied into the buffer
inside libpq. So rewriting ctx->out doesn't have an effeict after
putmessage. Actually LogicalRepApplyLoop always receives packets
with send_time = 0! I overlooked that while discussing there..

It is also set by keep-alive message so we don't observe
pg_stat_subscription.last_msg_send_time to be null so often but
we can surely see it. (it is artificially caused by inserting a
wait just after UpdateWorkerStats call for 'w' message in
LogicalRepApplyLoop)

select * from pg_stat_subscription ;
-[ RECORD 1 ]---------+------------------------------
subid | 16390
subname | sub1
pid | 17191
relid |
received_lsn | 0/1C0FDAE0
last_msg_send_time |
last_msg_receipt_time | 2018-03-30 14:18:55.270096+09
latest_end_lsn | 0/1C0FDA70
latest_end_time | 2018-03-30 14:18:48.268534+09

> Sorry, I have not received confirmation whether it is a bug or not and
> is it going to be fixed.

So, my diagnosis is that this is apparently a bug, which should
be fixed as Konstantin is proposing. And it should be back-patched?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-03-30 05:38:44 Re: [HACKERS] path toward faster partition pruning
Previous Message Andres Freund 2018-03-30 05:03:51 Re: pgsql: Rewrite the code that applies scan/join targets to paths.