Re: emergency outage requiring database restart

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: emergency outage requiring database restart
Date: 2016-10-26 17:57:22
Message-ID: CAHyXU0wC+sWBBcepCxkASO-hUEidK1ACbK1cqDXzE6GKDb7Spg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 26, 2016 at 12:43 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Oct 26, 2016 at 11:35 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Tue, Oct 25, 2016 at 3:08 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>> Confirmation of problem re-occurrence will come in a few days. I'm
>>> much more likely to believe 6+sigma occurrence (storage, freak bug,
>>> etc) should it prove the problem goes away post rebuild.
>>
>> ok, no major reported outage yet, but just got:
>>
>> 2016-10-26 11:27:55 CDT [postgres(at)castaging]: ERROR: invalid page in
>> block 12 of relation base/203883/1259

*) I've now strongly correlated this routine with the damage.

[root(at)rcdylsdbmpf001 ~]# cat /var/lib/pgsql/9.5/data/pg_log/postgresql-26.log | grep -i pushmarketsample | head -5
2016-10-26 11:26:27 CDT [postgres(at)castaging]: LOG: execute <unnamed>: SELECT PushMarketSample($1::TEXT) AS published
2016-10-26 11:26:40 CDT [postgres(at)castaging]: LOG: execute <unnamed>: SELECT PushMarketSample($1::TEXT) AS published
PL/pgSQL function pushmarketsample(text,date,integer) line 103 at SQL statement
PL/pgSQL function pushmarketsample(text,date,integer) line 103 at SQL statement
2016-10-26 11:26:42 CDT [postgres(at)castaging]: STATEMENT: SELECT PushMarketSample($1::TEXT) AS published

*) The first invocation was at 11:26:27 CDT

*) The second invocation was at 11:26:40 and gave a checksum error (as
noted earlier, at 11:26:42)

*) Routine attached (if interested)
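
The correlation above could be checked mechanically with something like the
following. This is only a rough sketch: the helper names (parse_ts,
calls_before_errors), the 120 second window, and the choice to treat only
"LOG:" lines as invocations are illustrative assumptions, and the time zone
abbreviation is ignored, so all lines are assumed to share one zone.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as it appears in the log excerpt, e.g. "2016-10-26 11:26:27 CDT ".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \w+ ")

def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    m = TS.match(line)
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") if m else None

def calls_before_errors(lines, routine, error_text="invalid page", window_seconds=120):
    """Pair each matching error line with the routine calls logged shortly before it.

    Assumes `lines` are in chronological order, as in a single log file.
    """
    window = timedelta(seconds=window_seconds)
    calls = []   # timestamps of invocations seen so far
    result = []  # (error timestamp, [recent call timestamps])
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # continuation/context line without a timestamp
        if error_text in line:
            recent = [c for c in calls if timedelta(0) <= ts - c <= window]
            result.append((ts, recent))
        elif "LOG:" in line and routine.lower() in line.lower():
            # Only LOG lines count as invocations; STATEMENT lines repeat the query text.
            calls.append(ts)
    return result
```

Feeding it the lines of the log file with routine="PushMarketSample" would
list, for each matching error, the invocations that preceded it within the
window.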

My next step is to set up a test environment and jam this routine
aggressively to see what happens.
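
One possible shape for such a stress harness is sketched below. It is not
the actual test setup: hammer() and its parameters are hypothetical, and the
`call` argument stands in for whatever function opens its own database
connection and runs the routine once.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def hammer(call, workers=8, iterations=100):
    """Run call(i) `iterations` times across `workers` threads.

    Returns a list of (i, exception) for every invocation that raised.
    `call` is a hypothetical placeholder; in a real test it would open its
    own connection and execute the routine under test.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(call, i): i for i in range(iterations)}
        for fut in as_completed(futures):
            exc = fut.exception()
            if exc is not None:
                failures.append((futures[fut], exc))
    return failures
```

Collecting the exceptions instead of stopping at the first one keeps the
load running, which matters if the damage only shows up under sustained
concurrent calls.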

merlin

Attachment Content-Type Size
PushMarketSample.sql text/x-sql 11.0 KB
