Re: Timeline in the light of Synchronous replication

From: fazool mein <fazoolmein(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Timeline in the light of Synchronous replication
Date: 2010-10-18 17:59:46
Message-ID: AANLkTik4zvqBf96BrU7NgVB5OYMbAHNwESr0CxF5VgQE@mail.gmail.com
Lists: pgsql-hackers

I believe we should come up with a general solution that also covers
potential future problems (for example, if in sync replication we later
decide to send writes to the standbys in parallel with writing to local disk).

The ideal thing would be to have an id that is incremented on every failure
and is stored in the WAL. Whenever a standby connects to the primary, it
should send the point p in the WAL where streaming should start, plus the id
it has for that point. If the primary has the same id at point p, things are
good. Otherwise, the primary should tell the standby to either create a new
copy from scratch or delete some WAL files.
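
To make the check concrete, here is a rough sketch in Python. It is only an
illustration under assumptions of mine, not existing PostgreSQL code: WAL
positions are plain integers, the primary keeps a list of the positions at
which the id changed, and the names (FailureHistory, check_standby, Decision)
are hypothetical. The rule for choosing between "delete some WAL" and "start
from scratch" is also an assumption; in particular it ignores whether the
standby has already applied the WAL it would have to delete.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    STREAM = "stream"              # ids match at p, streaming can start
    TRUNCATE_WAL = "truncate_wal"  # standby must delete WAL past a fork point
    REBUILD = "rebuild"            # standby must be recreated from scratch


@dataclass
class FailureHistory:
    """Primary-side record of (start_position, id) pairs, sorted by position."""
    switches: list = field(default_factory=lambda: [(0, 1)])

    def record_failure(self, position):
        # The id is incremented on every failure; the new id applies from
        # `position` onwards.
        last_id = self.switches[-1][1]
        self.switches.append((position, last_id + 1))

    def id_at(self, position):
        # Find the last switch whose start is <= position.
        starts = [start for start, _ in self.switches]
        return self.switches[bisect_right(starts, position) - 1][1]

    def end_of(self, failure_id):
        # Position at which `failure_id` was superseded, or None if it is
        # the current id or unknown to the primary.
        for i, (_, fid) in enumerate(self.switches[:-1]):
            if fid == failure_id:
                return self.switches[i + 1][0]
        return None


def check_standby(history, start_pos, standby_id):
    """Decide what a standby asking to stream from start_pos should do."""
    if history.id_at(start_pos) == standby_id:
        return Decision.STREAM, None
    fork = history.end_of(standby_id)
    if fork is not None and fork <= start_pos:
        # The standby kept writing under an old id past the point where the
        # primary moved on; everything from `fork` onwards is suspect.
        return Decision.TRUNCATE_WAL, fork
    return Decision.REBUILD, None
```

For example, if the id changed from 1 to 2 at position 100, a standby asking
for position 120 with id 1 would be told to delete its WAL from 100 onwards,
while one asking for position 50 with id 1 could stream directly.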

@David
> One way to get them in sync without starting from scratch is to use
> rsync from A to B. This works in the asynchronous case, too. :)

The main problem is detecting when one can rsync/stream and when one cannot.

Regards

On Mon, Oct 18, 2010 at 1:57 AM, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr> wrote:

> Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
> > But, even though we will have done that, it should be noted that WAL in
> > A might be ahead of that in B. For example, A might crash right after
> > writing WAL to the disk and before sending it to B. So when we restart
> > the old master A as the standby after failover, we would need to delete
> > some WAL files (in A) which are inconsistent with the WAL sequence in B.
>
> The idea of sending the current last applied LSN from master to slave has
> been talked about already. It would allow sending the WAL content in
> parallel with its local fsync() on the master; the standby would refrain
> from applying any WAL segment until it knows the master is past that.
>
> Now, given such a behavior, that would mean that when A joins again as a
> standby, it would have to ask B for the current last applied LSN too,
> and would notice the timeline change. Maybe by adding a facility to
> request the last LSN of the previous timeline, and with the behavior
> above applied there (skipping now-known-future-WALs in recovery), that
> would work automatically?
>
> There's still the problem of WALs that have been applied before
> recovery; I don't know that we can do anything here. But maybe we could
> also tweak the CHECKPOINT mechanism not to advance the restart point
> until we know the standbys have already replayed everything up to the
> restart point?
>
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>
