Re: Critical failure of standby

From: James Sewell <james(dot)sewell(at)jirotech(dot)com>
To: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
Cc: John R Pierce <pierce(at)hogranch(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical failure of standby
Date: 2016-08-16 07:11:10
Message-ID: CAANVwEvu6ZSPC-9eKbaohGry_cWLMcmHjx-BVWZ4T+xEk6o1+g@mail.gmail.com
Lists: pgsql-general

Hey Sameer,

As per the logs there was a crash of one standby, which seems to have
corrupted that standby and the two cascading standbys.

- No backups
- Full page writes enabled
- Fsync enabled
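
On the archive script mentioned further down the thread: PostgreSQL treats a
WAL segment as archived only when archive_command exits with status zero, so a
wrapper that swallows errors makes a failed copy look like a success. Below is
a minimal sketch of a wrapper that passes the failure back, assuming a plain
file copy into a mounted archive directory. The script name, the paths and the
exit codes are hypothetical and are not taken from our actual setup.

```python
import os
import shutil
import sys


def archive_wal(src_path: str, dest_dir: str) -> int:
    """Copy one WAL segment into dest_dir.

    Returns 0 only if the copy completed; any failure yields a non-zero
    code so that PostgreSQL keeps the segment and retries later.
    """
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    if os.path.exists(dest_path):
        # Never overwrite a segment that is already in the archive.
        return 1
    tmp_path = dest_path + ".tmp"
    try:
        # Copy under a temporary name, then rename, so a half-written
        # file never appears under the final segment name.
        shutil.copyfile(src_path, tmp_path)
        os.rename(tmp_path, dest_path)
    except OSError:
        # Covers an unreachable or missing archive destination.
        return 2
    return 0


def main(argv: list) -> int:
    """Expects: <script> <path to WAL file> <archive directory>."""
    if len(argv) != 3:
        return 64
    return archive_wal(argv[1], argv[2])


# Runs only when arguments are supplied, so the functions above can also
# be imported or pasted into an interactive session.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

It would be wired up with something like
archive_command = 'python3 /path/to/archive_wal.py %p /mnt/archive'
(both paths hypothetical). With DR network separated from Prod, the copy
raises an OSError, the script exits with 2, and PostgreSQL logs the failed
archive attempt instead of recording the segment as archived.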

Cheers,

James Sewell,
Solutions Architect

Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
*P* (+61) 2 8099 9000 *W* www.jirotech.com *F* (+61) 2 8099 9099

On Tue, Aug 16, 2016 at 3:15 PM, Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
wrote:

>
>
> On Tue, Aug 16, 2016 at 1:10 PM James Sewell <james(dot)sewell(at)jirotech(dot)com>
> wrote:
>
>> Hey,
>>
>> I understand that.
>>
>> But a hot standby should always be ready to promote (given it originally
>> caught up) right?
>>
>> I think it's a moot point really, as some sort of corruption has been
>> introduced: the machines still wouldn't start after they could see
>> the archive destination again.
>>
>
> Did you have a pending base backup on the standby, or a start backup (with
> no matching stop backup)?
>
> Was there a crash/immediate shutdown on the standby during this network
> outage? Do you have full page writes/fsync disabled?
>
>
>>
>> Cheers,
>>
>> James Sewell,
>> Solutions Architect
>>
>>
>>
>> Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
>> *P* (+61) 2 8099 9000 *W* www.jirotech.com *F* (+61) 2 8099 9099
>>
>> On Tue, Aug 16, 2016 at 12:36 PM, John R Pierce <pierce(at)hogranch(dot)com>
>> wrote:
>>
>>> On 8/15/2016 7:23 PM, James Sewell wrote:
>>>
>>>> Those are all good questions.
>>>>
>>>> Essentially this is a situation where DR is network separated from Prod
>>>> - so I would expect the archive command to fail. I'll have to check the
>>>> script; it must not be passing the error back through to PostgreSQL.
>>>>
>>>> This still shouldn't cause database corruption though, right? It's
>>>> just not getting WALs.
>>>>
>>>
>>> If the slave database is asking for the WALs, it needs them. If it
>>> needs them and can't get them, then it can't catch up and start.
>>>
>>>
>>>
>>> --
>>> john r pierce, recycling bits in santa cruz
>>>
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>> ------------------------------
>> The contents of this email are confidential and may be subject to legal
>> or professional privilege and copyright. No representation is made that
>> this email is free of viruses or other defects. If you have received this
>> communication in error, you may not copy or distribute any part of it or
>> otherwise disclose its contents to anyone. Please advise the sender of your
>> incorrect receipt of this correspondence.
>
> --
> --
> Best Regards
> Sameer Kumar | DB Solution Architect
> *ASHNIK PTE. LTD.*
>
> 101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533
>
> T: +65 6438 3504 | M: +65 8110 0350
>
> Skype: sameer.ashnik | www.ashnik.com
>

