From: | Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Strange debug message of walreciver? |
Date: | 2015-03-09 01:04:36 |
Message-ID: | CAFcNs+pSXmyR2iZQ24sJyx0A-gJGU5spOc+Yky0Y+WR7L1TGcg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
> > On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org>
wrote:
> >>
> >> When I set log_min_messages to debug5 and looked into walreciver log,
> >> I saw this:
> >>
> >> 3600 2015-03-08 09:47:38 JST DEBUG: sendtime 2015-03-08
> > 09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
> > apply delay -1945478837 ms tran sfer latency 0 ms
> >>
> >> The "replication apply delay -1945478837 ms" part looks strange
> >> because the delay is below 0. The number is formatted as %d in elog
> >> call, and I suspect this is kind of integer overflow.
> >>
> >
> > Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
that
> > the integer overflow occurs because sometimes the return of the
> > GetCurrentChunkReplayStartTime() is 0 (zero).
> >
> > I added an elog into GetReplicationApplyDelay to check this info:
> >
> > DEBUG: GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
> > (0.000000, 479099832352083.000000)
> > DEBUG: sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
> > 00:17:12.352043-03 replication apply delay -1936504800 ms transfer
latency
> > 0 ms
> > DEBUG: GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
> > (0.000000, 479099841450320.000000)
> > DEBUG: sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
> > 00:17:21.450294-03 replication apply delay -1936495702 ms transfer
latency
> > 0 ms
>
> Right. Until walreceiver gets the first WAL record to replay,
> xlogctl->currentChunkStartTime remains 0.
>
> > Maybe we should check before and return zero from
GetReplicationApplyDelay.
> > The attached patch implement it.
>
> Your patch regards 0 replication apply delay in the case above which
> is confusing if the delay is actually 0.
>
> What about something like this to explicitly stats the delay data is
> not available?
>
> elog(DEBUG2, "sendtime %s receipttime %s replication apply delay (N/A)
transfer latency %d ms",
>
Makes sense. Attached patch implement what you suggested.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
Attachment | Content-Type | Size |
---|---|---|
fix_strange_debug_message_in_walreceiver_v2.patch | text/x-diff | 2.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tatsuo Ishii | 2015-03-09 01:33:47 | Re: Strange debug message of walreciver? |
Previous Message | Kouhei Kaigai | 2015-03-09 00:16:48 | Re: Join push-down support for foreign tables |