Re: warning message in standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: warning message in standby
Date: 2010-06-14 11:42:55
Message-ID: AANLkTilQEdQ6kEKuyyQ2SQZLX1Tn-6fcLkvJPAkXt-T3@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 14, 2010 at 7:18 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Mon, Jun 14, 2010 at 13:11, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> Magnus Hagander wrote:
>>> >> Seems like we need something like WARNING that doesn't cause the process
>>> >> to die, but more alarming like ERROR/FATAL/PANIC. Or maybe just adding a
>>> >> hint to the warning will do. How about
>>> >>
>>> >> WARNING: invalid record length at 0/4005330
>>> >> HINT: An invalid record was streamed from master. That can be a sign of
>>> >> corruption in the master, or inconsistency between master and standby
>>> >> state. The record will be re-fetched, but that is unlikely to fix the
>>> >> problem. You may have to restore standby from base backup.
>>> >
>>> > I am thinking about log monitoring tools like Nagios. I am afraid
>>> > they are never going to pick up something tagged WARNING, no matter
>>>
>>> If they are properly configured, I imagine they would. And if they're
>>> not, well, there's not much for us to do.
>>
>> What does that mean?
>
> It means that we can't prevent people from configuring their tools to
> ignore important warnings. We can't prevent them from ignoring ERROR or
> FATAL either...

Right. Certainly, ERROR would be better than WARNING, though, because
someone, somewhere out there has a log-filtering tool that extracts
ERRORs but ignores WARNINGs.

What still bugs me about this situation is that we're essentially
trying futilely to recover from what's really a fatal error. There is
no manner of proceeding that has any hope of success, yet we just keep
hopelessly retrying. Why do we do that here and not elsewhere? By
the logic we're using here, we ought to retry when we hit a division
by zero error. Maybe the next time we read the second input value it
will have some bits set...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
