Re: Sync Rep v17

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Daniel Farina <daniel(at)heroku(dot)com>
Subject: Re: Sync Rep v17
Date: 2011-02-28 18:40:33
Message-ID: 1298918433.12992.1716.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2011-02-22 at 14:38 +0900, Fujii Masao wrote:
> On Mon, Feb 21, 2011 at 6:06 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > I've read about a tenth of the patch, so I'll submit another comments
> > about the rest later.
>
> Here are another comments:

Thanks for your comments
Code available at git://github.com/simon2ndQuadrant/postgres.git

> SyncRepReleaseWaiters should be called when walsender exits. Otherwise,
> if the standby crashes while a transaction is waiting for replication,
> it waits infinitely.

Will think on this.

> sync_rep_service and potential_sync_standby are not required to be in the
> WalSnd shmem because only walsender accesses them.

For use in debug, if not later monitoring

> +static bool
> +SyncRepServiceAvailable(void)
> +{
> + bool result = false;
> +
> + SpinLockAcquire(&WalSndCtl->ctlmutex);
> + result = WalSndCtl->sync_rep_service_available;
> + SpinLockRelease(&WalSndCtl->ctlmutex);
> +
> + return result;
> +}

Fixed

> volatile pointer needs to be used to prevent code rearrangement.
>
> + slock_t ctlmutex; /* locks shared variables shown above */
>
> cltmutex should be initialized by calling SpinLockInit.

Fixed

> + /*
> + * Stop providing the sync rep service, even if there are
> + * waiting backends.
> + */
> + {
> + SpinLockAcquire(&WalSndCtl->ctlmutex);
> + WalSndCtl->sync_rep_service_available = false;
> + SpinLockRelease(&WalSndCtl->ctlmutex);
> + }
>
> sync_rep_service_available should be set to false only when
> there is no synchronous walsender.

The way I had it is "correct" because "if (MyWalSnd->sync_rep_service)"
then if we're the sync walsender, so if we stop being it, then there
isn't one. A potential walsender might then become the sync walsender.

I think you'd like it if there was no gap at the point the potential wal
sender takes over? Just not sure how to do that robustly at present.
Will think some more.

> + /*
> + * When we first start replication the standby will be behind the primary.
> + * For some applications, for example, synchronous replication, it is
> + * important to have a clear state for this initial catchup mode, so we
> + * can trigger actions when we change streaming state later. We may stay
> + * in this state for a long time, which is exactly why we want to be
> + * able to monitor whether or not we are still here.
> + */
> + WalSndSetState(WALSNDSTATE_CATCHUP);
> +
>
> The above has already been committed. Please remove that from the patch.

Removed

> I don't like calling SyncRepReleaseWaiters for each feedback because
> I guess that it's too frequent. How about receiving all the feedbacks available
> from the socket, and then calling SyncRepReleaseWaiters as well as
> walreceiver does?

Possibly, but an optimisation for later when we have behaviour correct.

> + bool ownLatch; /* do we own the above latch? */
>
> We can just remove the ownLatch flag.

Agreed, removed

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-02-28 18:40:40 Re: Sync Rep v17
Previous Message Simon Riggs 2011-02-28 18:40:24 Re: Sync Rep v17