Re: Issues with Quorum Commit

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-08 02:24:35
Message-ID: AANLkTimTRDEjxPq1uYJ1JQR3F4--Vs5wJinunkePAaWu@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 6, 2010 at 6:00 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> In general, salvaging the WAL that was not sent to the standby yet is
> outright impossible. You can't achieve zero data loss with asynchronous
> replication at all.

No. That depends on the type of failure. Unless the disk in the master has
been corrupted, we might be able to salvage WAL.

>> If we want only no data loss, we have only to implement the wait-forever
>> option. But if we make consideration for the above-mentioned availability,
>> the return-immediately option also would be required.
>>
>> In some (many, I think) cases, I think that we need to consider
>> availability and no data loss together, and consider the balance of them.
>
> If you need both, you need three servers as Simon pointed out earlier. There
> is no way around that.

No. That depends on how strong a guarantee against data loss you need.

People who use a shared-disk failover solution with one master and one standby
don't need such high durability. They can avoid data loss to a certain extent
by using something like RAID, so it's not a problem for them to run the master
alone after a failover happens or the standby goes down. But something like
RAID cannot increase availability. Synchronous replication is a solution for
that purpose.

Of course, if we are worried about running the master alone, we can increase
the number of standbys. Furthermore, if we'd like to avoid data loss from a
disaster that could destroy all the servers at the same time, we might need to
add even more standbys and locate some of them at a remote site.

Please consider that "return-immediately (i.e., timeout = small)" is useful
for some use cases.
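
To make the two options concrete, here is a rough sketch of the commit-side
wait. It is only a toy model, not PostgreSQL code: the function name
wait_for_standby_ack, the timeout_s parameter and the returned strings are all
made up for illustration. timeout_s=None stands for "wait-forever", and a small
timeout_s stands for "return-immediately".

```python
import threading


def wait_for_standby_ack(acked, timeout_s=None):
    """Toy model of a commit waiting for the standby (hypothetical names).

    acked     -- threading.Event set when the standby confirms the WAL
    timeout_s -- None means wait-forever; a small value means
                 return-immediately once the timeout expires
    """
    # Event.wait() returns True if the event was set, and False if the
    # timeout expired first. With timeout_s=None it blocks until set.
    if acked.wait(timeout_s):
        return "replicated"   # commit is on the master and the standby
    return "local-only"       # gave up waiting; the master runs alone


if __name__ == "__main__":
    # Standby is down: a small timeout keeps the master available.
    down = threading.Event()
    print(wait_for_standby_ack(down, timeout_s=0.05))

    # Standby answers after a short delay: wait-forever gets the ack.
    up = threading.Event()
    threading.Timer(0.05, up.set).start()
    print(wait_for_standby_ack(up, timeout_s=None))
```

With the small timeout the commit returns even though the standby never
answers, which trades some durability for availability. With timeout_s=None
the same call would block until the standby comes back.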

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
