Re: Sync Rep v17

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Daniel Farina <daniel(at)heroku(dot)com>
Subject: Re: Sync Rep v17
Date: 2011-02-24 13:08:34
Message-ID: AANLkTim1_yCOv29EB6uQpWbYhhqPYyC2MLd=nHQZGpRT@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 22, 2011 at 2:38 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> I've read about two-tenths of the patch, so I'll submit another comments
> about the rest later. Sorry for the slow reviewing...

Here are another comments:

+ {"synchronous_standby_names", PGC_SIGHUP, WAL_REPLICATION,
+ gettext_noop("List of potential standby names to synchronise with."),
+ NULL,
+ GUC_LIST_INPUT | GUC_IS_NAME

Why did you add GUC_IS_NAME here? I don't think that it's reasonable
to limit the length of this parameter to 63. Because dozens of standby
names might be specified in the parameter.

SyncRepQueue->qlock should be initialized by calling SpinLockInit?

+ * Portions Copyright (c) 2010-2010, PostgreSQL Global Development Group

Typo: s/2010/2011

sync_replication_timeout_client would mess up the "wait-forever" option.
So, when allow_standalone_primary is disabled, ISTM that
sync_replication_timeout_client should have no effect.

Please check max_wal_senders before calling SyncRepWaitForLSN for
non-replication case.

SyncRepRemoveFromQueue seems not to be as short-term as we can
use the spinlock. Instead, LW lock should be used there.

+ old_status = get_ps_display(&len);
+ new_status = (char *) palloc(len + 21 + 1);
+ memcpy(new_status, old_status, len);
+ strcpy(new_status + len, " waiting for sync rep");
+ set_ps_display(new_status, false);
+ new_status[len] = '\0'; /* truncate off " waiting" */

Updating the PS display should be skipped if update_process_title is false.

+ /*
+ * XXX extra code needed here to maintain sorted invariant.

Yeah, such a code is required. I think that we can shorten the time
it takes to find an insert position by searching the list backwards.
Because the given LSN is expected to be relatively new in the queue.

+ * Our approach should be same as racing car - slow in, fast out.
+ */

Really? Even when removing the entry from the queue, we need
to search the queue as well as we do in the add-entry case.
Why don't you make walsenders remove the entry from the queue,
instead?

+ long timeout = SyncRepGetWaitTimeout();
<snip>
+ bool timeout = false;
<snip>
+ else if (timeout > 0 &&
+ TimestampDifferenceExceeds(GetCurrentTransactionStopTimestamp(),
+ now, timeout))
+ {
+ release = true;
+ timeout = true;
+ }

You seem to mix up two "timeout" variables.

+ if (proc->lwWaitLink == MyProc)
+ {
+ /*
+ * Remove ourselves from middle of queue.
+ * No need to touch head or tail.
+ */
+ proc->lwWaitLink = MyProc->lwWaitLink;
+ }

When we find ourselves, we should break out of the loop soon,
instead of continuing the loop to the end?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2011-02-24 13:10:01 Re: pl/python tracebacks
Previous Message Tatsuo Ishii 2011-02-24 12:50:36 Re: Invitation to Cluster Hackers meeting at pgCon