Re: Postgres Stale Statistics

From: Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
To: Adelino Silva <adelino(dot)silva(at)pt(dot)ibm(dot)com>
Cc: Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: Postgres Stale Statistics
Date: 2022-05-10 18:14:02
Message-ID: CAFpL5VyuYPS9on8ANA8vQ4Fdr18NWGvdGcD2YnB-ZdVFtPHwvg@mail.gmail.com
Lists: pgsql-admin

Hi,

Any inputs on how we can debug this further?

Thanks,
Nikhil

On Sun, 8 May 2022 at 12:12 PM, Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
wrote:

> Hi Adelino,
>
>> About the EAGAIN (Resource temporarily unavailable).
>> UDP is a stateless protocol, unlike TCP, which is connection oriented. The
>> recvfrom() code will not know whether or not the sender has closed its
>> socket; it only knows whether or not there is data waiting to be read.
>> According to the man page for recvfrom on Linux:
>> If no messages are available at the socket, the receive calls wait
>> for a message to arrive, unless the socket is nonblocking (see fcntl(2)) in
>> which case the value -1 is returned and the external variable errno set to
>> EAGAIN.
>
>
> Are you saying the message '-1 EAGAIN (Resource temporarily unavailable)'
> is normal?
>
>> You may need to explore other options, such as disk saturation.
>> using stale statistics instead of current ones because stats collector is
>> not responding
>> <https://opensourcedbtech.com/2018/04/03/using-stale-statistics-instead-of-current-ones-because-stats-collector-is-not-responding/>
>
>
> There is no disk saturation from what we see.
>
> Thanks,
> Nikhil
>
> On Thu, Apr 28, 2022 at 2:52 PM Adelino Silva <adelino(dot)silva(at)pt(dot)ibm(dot)com>
> wrote:
>
>> Hi Nikhil,
>>
>> About the EAGAIN (Resource temporarily unavailable).
>> UDP is a stateless protocol, unlike TCP, which is connection oriented. The
>> recvfrom() code will not know whether or not the sender has closed its
>> socket; it only knows whether or not there is data waiting to be read.
>> According to the man page for recvfrom on Linux:
>>
>> If no messages are available at the socket, the receive calls wait
>> for a message to arrive, unless the socket is nonblocking (see fcntl(2)) in
>> which case the value -1 is returned and the external variable errno set to
>> EAGAIN.
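
The behaviour described in the man page excerpt can be reproduced with a short sketch. This is only an illustration in Python, not PostgreSQL code. It binds a non-blocking UDP socket on 127.0.0.1 (an assumed test address), reads until the kernel queue is empty, and shows that the final read fails with EAGAIN, which is the ordinary "nothing left to read" signal for a non-blocking socket. The function names drain() and demo() are made up for the example.

```python
import errno
import select
import socket


def drain(sock):
    """Read datagrams until the queue is empty; return their sizes."""
    sizes = []
    while True:
        try:
            data = sock.recv(1000)
        except BlockingIOError as exc:
            # Non-blocking socket with nothing queued: the read fails with
            # EAGAIN (EWOULDBLOCK has the same value on Linux).
            assert exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
            return sizes
        sizes.append(len(data))


def demo():
    """Drain an empty socket, queue two datagrams, then drain again."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        rx.setblocking(False)
        before = drain(rx)  # nothing sent yet, so EAGAIN immediately
        tx.sendto(b"x" * 152, rx.getsockname())
        tx.sendto(b"y" * 936, rx.getsockname())
        select.select([rx], [], [], 1.0)  # wait until data is readable
        after = drain(rx)  # reads what is queued, then hits EAGAIN
        return before, after
    finally:
        rx.close()
        tx.close()


if __name__ == "__main__":
    print(demo())
```

A receiver written this way ends every batch of reads with one EAGAIN, so that error on its own does not indicate a failure.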
>>
>>
>> You may need to explore other options, such as disk saturation.
>> using stale statistics instead of current ones because stats collector is
>> not responding
>> <https://opensourcedbtech.com/2018/04/03/using-stale-statistics-instead-of-current-ones-because-stats-collector-is-not-responding/>
>>
>> Regards,
>>
>> Adelino Silva
>> ------------------------------
>> *From:* Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
>> *Sent:* Wednesday, April 27, 2022 4:36 PM
>> *To:* Adelino Silva <adelino(dot)silva(at)pt(dot)ibm(dot)com>
>> *Cc:* Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
>> *Subject:* [EXTERNAL] Re: Postgres Stale Statistics
>>
>> Hi Adelino,
>>
>> I went through the article. IPv6 is not the issue in our case; the
>> instance is using IPv4.
>>
>>
>> I ran strace and found a 'Resource temporarily unavailable' error, though.
>> I am not sure what this means. Does it indicate an issue with disk I/O?
>>
>> strace: Process 5134 attached
>> epoll_wait(3, [{EPOLLIN, {u32=31860224, u64=31860224}}], 1, -1) = 1
>> close(3) = 0
>> recvfrom(10, "\2\0\0\0\230\0\0\0\7@\0\0\1\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1000, 0, NULL, NULL) = 152
>> recvfrom(10, 0x7ffeeb967fa0, 1000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
>> epoll_create1(EPOLL_CLOEXEC) = 3
>> epoll_ctl(3, EPOLL_CTL_ADD, 11, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=31860176, u64=31860176}}) = 0
>> epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=31860200, u64=31860200}}) = 0
>> epoll_ctl(3, EPOLL_CTL_ADD, 10, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=31860224, u64=31860224}}) = 0
>> epoll_wait(3, [{EPOLLIN, {u32=31860224, u64=31860224}}], 1, -1) = 1
>> close(3) = 0
>> recvfrom(10, "\2\0\0\0\250\3\0\0\7@\0\0\10\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1000, 0, NULL, NULL) = 936
>> recvfrom(10, "\2\0\0\0\250\3\0\0\0\0\0\0\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1000, 0, NULL, NULL) = 936
>> recvfrom(10, "\2\0\0\0x\1\0\0\7@\0\0\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1000, 0, NULL, NULL) = 376
>> recvfrom(10, 0x7ffeeb967fa0, 1000, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
>> epoll_create1(EPOLL_CLOEXEC) = 3
>> epoll_ctl(3, EPOLL_CTL_ADD, 11, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=31860176, u64=31860176}}) = 0
>> epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=31860200, u64=31860200}}) = 0
>> epoll_ctl(3, EPOLL_CTL_ADD, 10, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=31860224, u64=31860224}}) = 0
>> epoll_wait(3, [{EPOLLIN, {u32=31860224, u64=31860224}}], 1, -1) = 1
>> close(3)
>>
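
As a rough cross-check of the trace, the first bytes of each datagram can be decoded. The sketch below assumes a little-endian host and assumes the payload starts with a stats message header (a 4-byte message type followed by a 4-byte size), then a database OID and an entry count, as in a table-stats message. That layout is an assumption based on my reading of PostgreSQL 11's pgstat.h, not something stated in this thread. It also takes the byte after \7 in the trace to be '@' (0x40). The helper names strace_to_bytes() and decode_header() are hypothetical.

```python
import re
import struct

# C-style escapes that strace may print besides octal ones.
_SIMPLE = {"n": 10, "t": 9, "r": 13, "v": 11, "f": 12, "\\": 92, '"': 34}
# Octal escape, other backslash escape, or a literal character.
_TOKEN = re.compile(r"\\([0-7]{1,3})|\\(.)|(.)", re.DOTALL)

# Start of the buffer shown for the first recvfrom() in the trace.
FIRST = r"\2\0\0\0\230\0\0\0\7@\0\0\1\0\0\0\5\0\0\0"


def strace_to_bytes(text):
    """Convert a string as printed by strace back into raw bytes."""
    out = bytearray()
    for octal, simple, literal in _TOKEN.findall(text):
        if octal:
            out.append(int(octal, 8))
        elif simple:
            out.append(_SIMPLE[simple])
        else:
            out.append(ord(literal))
    return bytes(out)


def decode_header(text):
    """Return (message type, size, database OID, entry count).

    Assumed layout: int32 type, int32 size, uint32 OID, int32 count,
    all little-endian.
    """
    data = strace_to_bytes(text)
    return struct.unpack_from("<iiIi", data)


if __name__ == "__main__":
    print(decode_header(FIRST))
```

For the first recvfrom() this gives a size field of 152, which equals the return value of the call. Under the assumed layout, the collector appears to be receiving complete messages on that socket.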
>>
>> Regards,
>>
>> Nikhil
>>
>> On Wed, Apr 27, 2022 at 8:25 PM Adelino Silva <adelino(dot)silva(at)pt(dot)ibm(dot)com>
>> wrote:
>>
>> One possible cause for this problem is that the statistics collector
>> process is bound to an IP:port which is not responding.
>> See the following thread discussion.
>>
>>
>> https://stackoverflow.com/questions/46008372/using-stale-statistics-instead-of-current-ones
>>
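
One way to test the "IP:port not responding" theory outside the database is a loopback self-test: bind a UDP socket on each address that "localhost" resolves to, send one byte to that same socket, and see whether it comes back. As far as I understand, the stats collector does something similar when it sets up its socket, but the Python sketch below is only an approximation under that assumption, not PostgreSQL code. The function names, the test byte, and the 0.5 second timeout are all chosen for the example.

```python
import select
import socket

TEST_BYTE = b"\xc7"  # arbitrary value picked for this example


def check_address(family, sockaddr, timeout=0.5):
    """Bind a UDP socket, send one byte to itself, confirm it arrives."""
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            sock.bind(sockaddr)
            sock.connect(sock.getsockname())  # talk to our own port
            sock.send(TEST_BYTE)
            readable, _, _ = select.select([sock], [], [], timeout)
            return bool(readable) and sock.recv(1) == TEST_BYTE
    except OSError:
        # Address family unavailable, bind refused, packet rejected, etc.
        return False


def check_localhost(host="localhost"):
    """Run the self-test for every address the host name resolves to."""
    results = []
    infos = socket.getaddrinfo(host, 0, type=socket.SOCK_DGRAM)
    for family, _type, _proto, _canon, sockaddr in infos:
        results.append((sockaddr[0], check_address(family, sockaddr)))
    return results


if __name__ == "__main__":
    for address, ok in check_localhost():
        print(address, "ok" if ok else "FAILED")
```

If an address fails here, a firewall rule or a broken "localhost" entry in /etc/hosts would be worth checking before looking at the database itself.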
>> Regards,
>>
>> Adelino Silva
>>
>> ------------------------------
>> *From:* Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
>> *Sent:* Wednesday, April 27, 2022 2:49 PM
>> *To:* Adelino Silva <adelino(dot)silva(at)pt(dot)ibm(dot)com>
>> *Cc:* Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
>> *Subject:* [EXTERNAL] Re: Postgres Stale Statistics
>>
>> Hi Adelino,
>>
>> I had gone through that thread before. We cannot move the stats to RAM as
>> of now.
>>
>> Thanks,
>> Nikhil
>>
>> On Wed, Apr 27, 2022 at 6:16 PM Adelino Silva <adelino(dot)silva(at)pt(dot)ibm(dot)com>
>> wrote:
>>
>> Hi,
>>
>> Found this thread that explains the warning.
>> using stale statistics instead of current ones because stats collector is
>> not responding
>>
>> https://www.postgresql.org/message-id/1457523467.24545.43.camel@2ndquadrant.com
>>
>>
>> Regards,
>>
>> Adelino Silva
>>
>> ------------------------------
>> *From:* Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
>> *Sent:* Wednesday, April 27, 2022 12:08 PM
>> *To:* Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
>> *Subject:* [EXTERNAL] Postgres Stale Statistics
>>
>> Hi,
>>
>> We are getting the WARNING below on one of the standby instances. We are
>> not sure what caused it. We tried restarting the database instances to
>> resolve it, but the warning persists.
>>
>>
>> WARNING - using stale statistics instead of current ones because stats
>> collector is not responding
>>
>>
>> Postgresql version - 11.7
>>
>>
>> Any other option to resolve this? We are thinking of building the standby
>> again but what if the WARNING is for a primary database instance and a
>> restart won't solve it?
>>
>>
>> Thanks and Regards,
>>
>> Nikhil
>>
>>
