Re: Synchronous Log Shipping Replication

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronous Log Shipping Replication
Date: 2008-09-19 10:06:19
Message-ID: 1221818779.3913.2539.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Tue, 2008-09-09 at 09:11 +0100, Simon Riggs wrote:

> This gives us the Group Commit feature also, even if we are not using
> replication. So we can drop the commit_delay stuff.
>
> XLogBackgroundFlush() processes data page at a time if it can. That may
> not be the correct batch size for XLogBackgroundSend(), so we may need a
> tunable for the MTU. Under heavy load we need the Write and Send to act
> in a way to maximise throughput rather than minimise response time, as
> we do now.
>
> If wal_buffers overflows, we continue to hold WALInsertLock while we
> wait for WALWriter and WALSender to complete.
>
> We should increase default wal_buffers to 64.
>
> After (or during) XLogInsert backends will sleep in a proc queue,
> similar to LWlocks and protected by a spinlock. When preparing to
> write/send the WAL process should read the proc at the *tail* of the
> queue to see what the next LogwrtRqst should be. Then it performs its
> action and wakes procs up starting with the head of the queue. We would
> add LSN into PGPROC, so WAL processes can check whether the backend
> should be woken. The LSN field can be accessed without spinlocks since
> it is only ever set by the backend itself and only read while a backend
> is sleeping. So we access spinlock, find tail, drop spinlock then read
> LSN of the backend that (was) the tail.
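
To make the queue handling quoted above concrete, here is a rough,
single-threaded sketch in C. It is not code from any patch: DemoWaiter,
DemoWaitQueue and every demo_* function are hypothetical names, the
"spinlock" is only a flag so that the locking points are visible, and it
assumes backends join the queue in LSN order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t DemoLSN;

/* Hypothetical stand-in for the LSN field proposed for PGPROC. */
typedef struct DemoWaiter {
    DemoLSN lsn;               /* set only by the backend, before it sleeps */
    int asleep;                /* 1 while the backend waits on the queue */
    struct DemoWaiter *next;   /* next waiter, towards the tail */
} DemoWaiter;

typedef struct {
    DemoWaiter *head;
    DemoWaiter *tail;
    int spinlock_held;         /* a flag only; a real queue needs a spinlock */
} DemoWaitQueue;

static void demo_lock(DemoWaitQueue *q)
{
    assert(!q->spinlock_held);
    q->spinlock_held = 1;
}

static void demo_unlock(DemoWaitQueue *q)
{
    assert(q->spinlock_held);
    q->spinlock_held = 0;
}

/* Backend side: record our LSN, then join the tail of the queue. */
static void demo_sleep_on(DemoWaitQueue *q, DemoWaiter *w, DemoLSN lsn)
{
    w->lsn = lsn;
    w->asleep = 1;
    w->next = NULL;
    demo_lock(q);
    if (q->tail != NULL)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    demo_unlock(q);
}

/* WAL process: take spinlock, find tail, drop spinlock, then read its LSN. */
static DemoLSN demo_next_request(DemoWaitQueue *q, DemoLSN done_upto)
{
    DemoWaiter *tail;

    demo_lock(q);
    tail = q->tail;
    demo_unlock(q);
    /* LSN is read without the lock: the owner is asleep and cannot change it. */
    return (tail != NULL && tail->lsn > done_upto) ? tail->lsn : done_upto;
}

/* After the write/send: release waiters from the head, up to done_upto. */
static int demo_wake_up_to(DemoWaitQueue *q, DemoLSN done_upto)
{
    int woken = 0;

    demo_lock(q);
    while (q->head != NULL && q->head->lsn <= done_upto) {
        DemoWaiter *w = q->head;

        q->head = w->next;
        if (q->head == NULL)
            q->tail = NULL;
        w->next = NULL;
        w->asleep = 0;
        woken++;
    }
    demo_unlock(q);
    return woken;
}
```

A backend that joins the queue after demo_next_request() has returned
stays asleep until the next cycle, because its LSN is beyond what was
just written or sent.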

I left off mentioning one other aspect of "Group Commit" behaviour that
is possible with the above design.

If we use a proc queue, then we only wake up the *first* backend on
the queue. That lets other WAL processes continue quickly.

The reason for doing this is that the first backend can walk the commit
queue collecting xids. When we update the ProcArray we can then update
multiple backends' entries with a single request, rather than forcing
every backend to queue separately for the exclusive lock.

Once the first backend has updated the procarray, all of the backends
whose entries it updated are released at once.

Doing it that way will significantly reduce the number of exclusive lock
requests for commits, which is the main source of contention on the
procarray.
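
As a rough illustration of the leader behaviour described above, here is
a single-threaded sketch in C. It is not taken from any patch: DemoProc,
DemoCommitQueue and the demo_* functions are hypothetical names, and the
exclusive lock is modelled as a counter so that the number of lock
requests can be seen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoXid;      /* 0 stands for "no transaction running" */

/* Hypothetical stand-in for a backend's proc entry; not the real PGPROC. */
typedef struct DemoProc {
    DemoXid xid;               /* xid the backend is committing */
    int waiting;               /* 1 while asleep on the commit queue */
    struct DemoProc *next;     /* next waiter, towards the tail */
} DemoProc;

typedef struct {
    DemoProc *head;
    DemoProc *tail;
    int exclusive_requests;    /* simulated exclusive ProcArrayLock requests */
} DemoCommitQueue;

/* A committing backend joins the tail of the queue and goes to sleep. */
static void demo_enqueue(DemoCommitQueue *q, DemoProc *p)
{
    assert(p != NULL);
    p->waiting = 1;
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/*
 * Run by the first backend woken. It walks the queue, clears every queued
 * xid under ONE exclusive lock request, then releases all of the backends
 * it updated. Returns the number of backends released.
 */
static int demo_leader_commit(DemoCommitQueue *q)
{
    DemoProc *batch = q->head;
    int released = 0;

    if (batch == NULL)
        return 0;
    q->head = NULL;            /* detach the batch; later arrivals form */
    q->tail = NULL;            /* the next batch */

    q->exclusive_requests++;   /* acquire the exclusive lock once */
    for (DemoProc *p = batch; p != NULL; p = p->next)
        p->xid = 0;            /* mark the transaction as no longer running */
    /* the exclusive lock would be released here */

    while (batch != NULL) {    /* release every backend in the batch */
        DemoProc *next = batch->next;

        batch->next = NULL;
        batch->waiting = 0;
        released++;
        batch = next;
    }
    return released;
}
```

With three backends queued, one call to demo_leader_commit() releases
all three and bumps exclusive_requests once, where the one-at-a-time
approach would have made three exclusive requests.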

So that puts in batch setting behaviour for WALWriteLock and
ProcArrayLock. And I'm submitting a patch for batch setting of clog
entries around ClogControlLock. So we should get a scalability boost
from all of this.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
