Re: Back-branch update releases coming in a couple weeks

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Back-branch update releases coming in a couple weeks
Date: 2013-01-27 14:38:45
Message-ID: CC75D21C021C40739F57787EF1BA29E1@maumau
Lists: pgsql-hackers

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
> On Sun, Jan 27, 2013 at 12:17 AM, MauMau <maumau307(at)gmail(dot)com> wrote:
>> Although you said the fix will solve my problem, I don't feel it will.
>> The discussion is about the crash when the standby "re"starts after the
>> primary vacuums and truncates a table. On the other hand, in my case,
>> the standby crashed during failover (not at restart), emitting a message
>> that some WAL record refers to an "uninitialized" page (not a
>> non-existent page) of an "index" (not a table).
>>
>> In addition, fujii_test.sh did not reproduce the mentioned crash on
>> PostgreSQL 9.1.6.
>>
>> I'm sorry to cause you trouble, but could you elaborate on how the fix
>> relates to my case?
>
> Maybe I had not been understanding your problem correctly.
> Could you show the self-contained test case which reproduces the problem?
> Is the problem still reproducible in REL9_1_STABLE?

As I said before, it's very hard to reproduce the problem. All I did was
repeat the following sequence:

1. run "pg_ctl stop -mi" against the primary while the applications were
performing INSERT/UPDATE/SELECT.
2. run "pg_ctl promote" against the synchronous streaming replication
standby.
3. run pg_basebackup on the stopped (original) primary to create a new
standby, and start the new standby.

I did this failover test dozens of times, probably more than a hundred. And
I encountered the crash only once.
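
For reference, the sequence above can be sketched as a small driver that
builds the commands for one iteration. This is only a rough sketch, not the
script that was actually run: the function names, the directory paths, the
host name and the port are all hypothetical, and it assumes the old
primary's data directory has been emptied and a recovery.conf is written
before the new standby is started (neither step is shown).

```python
import subprocess


def failover_commands(old_primary_dir, standby_dir, new_primary_host, port=5432):
    """Build the commands for one failover iteration, in order.

    All arguments are hypothetical placeholders for the real environment.
    """
    return [
        # 1. Immediate shutdown of the primary while the applications run.
        ["pg_ctl", "stop", "-D", old_primary_dir, "-m", "immediate"],
        # 2. Promote the synchronous standby.
        ["pg_ctl", "promote", "-D", standby_dir],
        # 3. Take a base backup from the promoted server into the old
        #    primary's data directory (assumed to have been emptied first).
        ["pg_basebackup", "-h", new_primary_host, "-p", str(port),
         "-D", old_primary_dir],
        #    ... then start it as the new standby (assumes recovery.conf
        #    has been put in place beforehand).
        ["pg_ctl", "start", "-D", old_primary_dir],
    ]


def run_iteration(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

By default run_iteration() only prints the commands, so the sequence can be
checked before it is pointed at real servers. After each iteration the two
servers swap roles, so the directories and host would have to be swapped as
well before the next run.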

Although I saw the problem only once, the result is catastrophic. So I
really hope Heikki's patch (in cooperation with Horiguchi-san and you) can
fix the issue.

Can you think of anything?

Regards
MauMau
