Re: Strange debug message of walreciver?

From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange debug message of walreciver?
Date: 2015-03-09 01:04:36
Message-ID: CAFcNs+pSXmyR2iZQ24sJyx0A-gJGU5spOc+Yky0Y+WR7L1TGcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
> > On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org>
wrote:
> >>
> >> When I set log_min_messages to debug5 and looked into walreciver log,
> >> I saw this:
> >>
> >> 3600 2015-03-08 09:47:38 JST DEBUG: sendtime 2015-03-08
> > 09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
> > apply delay -1945478837 ms tran sfer latency 0 ms
> >>
> >> The "replication apply delay -1945478837 ms" part looks strange
> >> because the delay is below 0. The number is formatted as %d in elog
> >> call, and I suspect this is kind of integer overflow.
> >>
> >
> > Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
that
> > the integer overflow occurs because sometimes the return of the
> > GetCurrentChunkReplayStartTime() is 0 (zero).
> >
> > I added an elog into GetReplicationApplyDelay to check this info:
> >
> > DEBUG: GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
> > (0.000000, 479099832352083.000000)
> > DEBUG: sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
> > 00:17:12.352043-03 replication apply delay -1936504800 ms transfer
latency
> > 0 ms
> > DEBUG: GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
> > (0.000000, 479099841450320.000000)
> > DEBUG: sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
> > 00:17:21.450294-03 replication apply delay -1936495702 ms transfer
latency
> > 0 ms
>
> Right. Until walreceiver gets the first WAL record to replay,
> xlogctl->currentChunkStartTime remains 0.
>
> > Maybe we should check before and return zero from
GetReplicationApplyDelay.
> > The attached patch implement it.
>
> Your patch regards 0 replication apply delay in the case above which
> is confusing if the delay is actually 0.
>
> What about something like this to explicitly stats the delay data is
> not available?
>
> elog(DEBUG2, "sendtime %s receipttime %s replication apply delay (N/A)
transfer latency %d ms",
>

Makes sense. Attached patch implement what you suggested.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

Attachment Content-Type Size
fix_strange_debug_message_in_walreceiver_v2.patch text/x-diff 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2015-03-09 01:33:47 Re: Strange debug message of walreciver?
Previous Message Kouhei Kaigai 2015-03-09 00:16:48 Re: Join push-down support for foreign tables