From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Synchronous Log Shipping Replication |
Date: | 2008-09-09 09:54:08 |
Message-ID: | 48C647C0.4070905@enterprisedb.com |
Lists: | pgsql-hackers |
Simon Riggs wrote:
> Multiple backends waiting while we perform a write. Commits then happen
> as a group (to WAL at least), hence Group Commit.
The problem with our current commit protocol is this:
1. Backend A inserts commit record A
2. Backend A starts to flush commit record A
3. Backend B inserts commit record B
4. Backend B waits until 2. finishes
5. Backend B starts to flush commit record B
Note that we already have the logic to flush all pending commit records
at once. If there's also a backend C that inserts its commit record
after step 2, B and C will be flushed at once:
1. Backend A inserts commit record A
2. Backend A starts to flush commit record A
3. Backend B inserts commit record B
4. Backend B waits until 2. finishes
5. Backend C inserts commit record C
6. Backend C waits until 2. finishes
7. Flush A finishes. Backend B starts to flush commit records B+C
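
To make that sequence concrete, here is a rough toy model of it, not PostgreSQL code. The function name simulate_flushes, the (label, insert_time) tuples and the fixed flush duration are all assumptions made up for illustration. It only models the rule described above: whoever flushes next writes out every commit record inserted so far.

```python
def simulate_flushes(commits, flush_duration):
    """Toy model: return the batches of commit records written by each flush.

    commits is a list of (label, insert_time) pairs (hypothetical input
    format). A flush starts as soon as the log is idle and at least one
    record is waiting, and it covers every record inserted up to that moment.
    """
    commits = sorted(commits, key=lambda c: c[1])
    flushes = []
    log_free_at = 0.0  # time at which the previous flush finishes
    i = 0
    while i < len(commits):
        # The next flush cannot start before the previous one finishes,
        # nor before the oldest waiting record has been inserted.
        start = max(log_free_at, commits[i][1])
        batch = []
        while i < len(commits) and commits[i][1] <= start:
            batch.append(commits[i][0])
            i += 1
        flushes.append(batch)
        log_free_at = start + flush_duration
    return flushes


# A inserts at t=0, B at t=1, C at t=2; one flush takes 5 time units.
print(simulate_flushes([("A", 0.0), ("B", 1.0), ("C", 2.0)], 5.0))
```

In this model A is flushed alone, and B and C share the second flush, so three commits cost two flushes.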
The idea of group commit is to insert a small delay in backend A between
steps 1 and 2, so that we can flush both commit records in one fsync:
1. Backend A inserts commit record A
2. Backend A waits
3. Backend B inserts commit record B
4. Backend B starts to flush commit records A + B
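
The effect of such a delay can be sketched with a small toy model; this is not PostgreSQL code. The name simulate_group_commit, the (label, insert_time) tuples, the fixed flush duration and the fixed delay are all assumptions for illustration. As a simplification, the first waiting backend starts the flush once its delay expires, rather than B starting it on arrival.

```python
def simulate_group_commit(commits, flush_duration, delay):
    """Toy model: return (finish_time, batch) for each flush.

    commits is a list of (label, insert_time) pairs (hypothetical input
    format). The oldest waiting record holds off for `delay` time units
    before the flush starts, so later records can join the same flush.
    """
    commits = sorted(commits, key=lambda c: c[1])
    flushes = []
    log_free_at = 0.0  # time at which the previous flush finishes
    i = 0
    while i < len(commits):
        # Wait for the log to be idle and for the oldest record's delay.
        start = max(log_free_at, commits[i][1] + delay)
        batch = []
        while i < len(commits) and commits[i][1] <= start:
            batch.append(commits[i][0])
            i += 1
        log_free_at = start + flush_duration
        flushes.append((log_free_at, batch))
    return flushes


# A inserts at t=0, B at t=1; one flush takes 5 time units.
print(simulate_group_commit([("A", 0.0), ("B", 1.0)], 5.0, 0.0))  # no delay
print(simulate_group_commit([("A", 0.0), ("B", 1.0)], 5.0, 2.0))  # delay of 2
```

Without the delay the model needs two flushes, finishing at t=5 and t=10. With a delay of 2 both records go out in one flush that finishes at t=7: A commits later than before, B commits earlier. If B never shows up, A simply pays the delay for nothing.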
The tricky part is, how does A know if it should wait, and for how long?
commit_delay sure isn't ideal, but AFAICS the log shipping proposal
doesn't provide any solution to that.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com