Re: Function to track shmem reinit time

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Function to track shmem reinit time
Date: 2018-03-04 16:02:32
Message-ID: 1247478e-b106-9c82-0e23-d0dedbd47a78@2ndquadrant.com
Lists: pgsql-hackers

On 02/28/2018 01:11 PM, Anastasia Lubennikova wrote:
> Attached patch introduces a new function pg_shmem_init_time(),
> which returns the time shared memory was last (re)initialized.
> It is created for use by monitoring tools to track backend crashes.
>
> Currently, if the 'restart_after_crash' option is on, postgres will
> just restart. And the only way to know that it happened is to
> regularly parse the log file or monitor it, catching restart messages.
> This approach is really inconvenient for users who have gigabytes of
> logs.
>
> This new function can be periodically called by a monitoring agent,
> and, if shmem_init_time doesn't match pg_postmaster_start_time,
> we know that the server crashed and restarted, and also know exactly
> when.
>

I don't think it really solves the problem, though. For example, if the
whole VM reboots (which can be a matter of seconds), this check will say
"shmem_init_time == pg_postmaster_start_time" and you've not detected
anything.

IMHO pg_postmaster_start_time is the right way to monitor uptime, and
the right way to detect spurious restarts is to remember the last value
you've seen and compare it to the current one.
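
To make the "remember the last value" idea concrete, here is a rough
sketch of the comparison a monitoring agent might do. It is only an
illustration: the function name check_restart is made up, and the code
assumes the agent has already fetched the result of
"SELECT pg_postmaster_start_time()" as a datetime and stores the
returned value somewhere between polls.

```python
from datetime import datetime
from typing import Optional, Tuple


def check_restart(last_seen: Optional[datetime],
                  current: datetime) -> Tuple[bool, datetime]:
    """Compare the current postmaster start time with the last one seen.

    Hypothetical helper: `current` is assumed to be the value of
    pg_postmaster_start_time() fetched by the agent on this poll.
    Returns (restarted, value_to_remember_for_next_poll).
    """
    # First poll: there is nothing to compare against yet.
    if last_seen is None:
        return False, current
    # A changed start time means the postmaster was started again since
    # the previous poll, e.g. because the whole VM rebooted.
    return current != last_seen, current
```

The agent would keep the second element of the result and pass it back
as last_seen on the next poll; how it persists that value is left open.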

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
