Re: Strange debug message of walreciver?

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: fabriziomello(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange debug message of walreciver?
Date: 2015-03-08 23:13:24
Message-ID: 20150309.081324.1269469399784244003.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>>
>> When I set log_min_messages to debug5 and looked into walreciver log,
>> I saw this:
>>
>> 3600 2015-03-08 09:47:38 JST DEBUG: sendtime 2015-03-08
> 09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
> apply delay -1945478837 ms tran sfer latency 0 ms
>>
>> The "replication apply delay -1945478837 ms" part looks strange
>> because the delay is below 0. The number is formatted as %d in elog
>> call, and I suspect this is kind of integer overflow.
>>
>
> Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed that
> the integer overflow occurs because sometimes the return of the
> GetCurrentChunkReplayStartTime() is 0 (zero).
>
> I added an elog into GetReplicationApplyDelay to check this info:
>
> DEBUG: GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
> (0.000000, 479099832352083.000000)
> DEBUG: sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
> 00:17:12.352043-03 replication apply delay -1936504800 ms transfer latency
> 0 ms
> DEBUG: GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
> (0.000000, 479099841450320.000000)
> DEBUG: sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
> 00:17:21.450294-03 replication apply delay -1936495702 ms transfer latency
> 0 ms

Right. Until walreceiver gets the first WAL record to replay,
xlogctl->currentChunkStartTime remains 0.

> Maybe we should check before and return zero from GetReplicationApplyDelay.
> The attached patch implement it.

Your patch regards 0 replication apply delay in the case above which
is confusing if the delay is actually 0.

What about something like this to explicitly stats the delay data is
not available?

elog(DEBUG2, "sendtime %s receipttime %s replication apply delay (N/A) transfer latency %d ms",

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-03-08 23:30:36 Re: Clamping reulst row number of joins.
Previous Message Tomas Vondra 2015-03-08 22:01:33 PATCH: pgbench - logging aggregated info and transactions at the same time