Re: Back-branch update releases coming in a couple weeks

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Back-branch update releases coming in a couple weeks
Date: 2013-01-29 15:54:26
Message-ID: CAHGQGwECyjUzsffajCEjSsHKM72FeTuSnOpTufx7xAk=n94NXQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 27, 2013 at 11:38 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
>>
>> On Sun, Jan 27, 2013 at 12:17 AM, MauMau <maumau307(at)gmail(dot)com> wrote:
>>>
>>> Although you said the fix will solve my problem, I don't feel it will.
>>> The discussion is about the crash when the standby "re"starts after the
>>> primary vacuums and truncates a table. On the other hand, in my case, the
>>> standby crashed during failover (not at restart), emitting a message that
>>> some WAL record refers to an "uninitialized" page (not a non-existent
>>> page) of an "index" (not a table).
>>>
>>> In addition, fujii_test.sh did not reproduce the mentioned crash on
>>> PostgreSQL 9.1.6.
>>>
>>> I'm sorry to cause you trouble, but could you elaborate on how the fix
>>> relates to my case?
>>
>>
>> Maybe I had not been understanding your problem correctly.
>> Could you show the self-contained test case which reproduces the problem?
>> Is the problem still reproducible in REL9_1_STABLE?
>
>
> As I said before, it's very hard to reproduce the problem. All I did
> was repeat the following sequence:
>
> 1. run "pg_ctl stop -mi" against the primary while the applications were
> performing INSERT/UPDATE/SELECT.
> 2. run "pg_ctl promote" against the synchronous streaming standby.
> 3. run pg_basebackup on the stopped (original) primary to create a new
> standby, and start the new standby.
>
> I did this failover test dozens of times, probably more than a hundred. And
> I encountered the crash only once.
>
> Although I saw the problem only once, the result is catastrophic. So, I
> really hope Heikki's patch (in cooperation with Horiguchi-san and you)
> fixes the issue.
>
> Can you think of anything?

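For reference, one iteration of the sequence quoted above could be scripted
roughly as follows. This is only a sketch, not the script MauMau actually
ran. The data directories, host, port, the DRY_RUN and ITERATIONS variables
and the function names are all hypothetical, and it assumes both data
directories are reachable from one machine and that the 9.1-era
pg_basebackup -x option is available. With DRY_RUN=1 (the default) it only
prints the commands; setting DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Sketch of the failover test loop. Every path, host and port below is a
# hypothetical placeholder; adjust before running with DRY_RUN=0.
PRIMARY_DATA=${PRIMARY_DATA:-/tmp/pg_primary}              # original primary
STANDBY_DATA=${STANDBY_DATA:-/tmp/pg_standby}              # synchronous standby
NEW_STANDBY_DATA=${NEW_STANDBY_DATA:-/tmp/pg_new_standby}  # must be empty or absent
STANDBY_HOST=${STANDBY_HOST:-localhost}
STANDBY_PORT=${STANDBY_PORT:-5433}
DRY_RUN=${DRY_RUN:-1}
ITERATIONS=${ITERATIONS:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

failover_once() {
    # 1. Immediate shutdown of the primary (same as "pg_ctl stop -mi"),
    #    meant to be issued while the application workload is running.
    run pg_ctl stop -m immediate -D "$PRIMARY_DATA"

    # 2. Promote the synchronous streaming standby.
    run pg_ctl promote -D "$STANDBY_DATA"

    # 3. Take a base backup from the promoted server to build a new standby.
    run pg_basebackup -h "$STANDBY_HOST" -p "$STANDBY_PORT" \
        -D "$NEW_STANDBY_DATA" -x

    # The new standby needs a recovery.conf pointing at the new primary.
    if [ "$DRY_RUN" != 1 ]; then
        printf "standby_mode = 'on'\nprimary_conninfo = 'host=%s port=%s'\n" \
            "$STANDBY_HOST" "$STANDBY_PORT" > "$NEW_STANDBY_DATA/recovery.conf"
    fi

    run pg_ctl start -D "$NEW_STANDBY_DATA"
}

i=0
while [ "$i" -lt "$ITERATIONS" ]; do
    failover_once
    i=$((i + 1))
done
```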
Umm... from that information, it's hard to tell whether your problem has
been fixed in the latest 9.1. The bug fix which you mentioned consists of
two patches:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7bffc9b7bf9e09ddeddc65117e49829f758e500d
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=970fb12de121941939e777764d6e0446c974bba3

The former does not seem to be related to your problem, because the problem
that patch fixed could basically happen only when restarting the standby.
The latter might be related to your problem...

Regards,

--
Fujii Masao
