| From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> | 
|---|---|
| To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> | 
| Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Markus Wanner <markus(at)bluegap(dot)ch>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Synchronous Log Shipping Replication | 
| Date: | 2008-09-09 11:38:01 | 
| Message-ID: | 48C66019.40400@enterprisedb.com | 
| Lists: | pgsql-hackers | 
Fujii Masao wrote:
> What makes the sender process bottleneck?
The keyword here is "might". There are many possibilities, like:
- Slow network.
- Ridiculously fast disk. Like a RAM disk. If you have a synchronous 
slave you can fail over to, putting WAL on a RAM disk isn't that crazy.
- Slower WAL disk on the slave.
etc.
>> Backends then wait
>> * not at all for asynch commit
>> * just for Write for local synch commit
>> * for both Write and Send for remote synch commit
>> (various additional options for what happens to confirm Send)
> 
> I'd like to introduce a new parameter "synchronous_replication" which specifies
> whether backends wait for the response from the WAL sender process. By
> combining synchronous_commit and synchronous_replication, users can
> choose various options.
There's one thing I haven't figured out in this discussion. Does the 
write to the disk happen before or after the write to the slave? Can you 
guarantee that if a transaction is committed in the master, it's also 
committed in the slave, or vice versa?
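
To make the combinations quoted above concrete, here is a rough sketch of how the two settings might map onto what a backend waits for at commit. It assumes each parameter independently adds one wait, which is only one possible reading: "synchronous_replication" is the proposed parameter, not an existing one, and `Wait` and `commit_waits` are hypothetical names. The thread does not spell out the case of synchronous_commit off with synchronous_replication on, and the sketch says nothing about the order of the two waits, which is exactly the open question.

```python
from enum import Enum, auto


class Wait(Enum):
    """Things a backend could wait for at commit (hypothetical names)."""
    LOCAL_WRITE = auto()  # WAL written on the master
    REMOTE_SEND = auto()  # WAL sent to the slave and confirmed


def commit_waits(synchronous_commit: bool, synchronous_replication: bool) -> set:
    """Return the set of waits implied by the two settings.

    Assumption: the settings are independent, each adding one wait.
    """
    waits = set()
    if synchronous_commit:
        waits.add(Wait.LOCAL_WRITE)
    if synchronous_replication:
        waits.add(Wait.REMOTE_SEND)
    return waits


# The three cases listed in the quoted text:
asynch = commit_waits(False, False)        # waits for nothing
local_synch = commit_waits(True, False)    # waits for Write only
remote_synch = commit_waits(True, True)    # waits for Write and Send
```

Under this reading the fourth combination would wait for Send but not for the local Write, and whether that is a sensible mode is not settled here.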
>> Another thought occurs that we might measure the time a Send takes and
>> specify a limit on how long we are prepared to wait for confirmation.
>> Limit=0 => asynchronous. Limit > 0 implies synchronous-up-to-the-limit.
>> This would give better user behaviour across a highly variable network
>> connection.
> 
> From the viewpoint of detecting a network failure, this feature is necessary.
> When the network goes down, WAL sender can be blocked until it detects
> the network failure, i.e. WAL sender keeps waiting for a response which
> never comes. A timeout notification is necessary in order to detect a
> network failure quickly.
Agreed. But what happens if you hit that timeout? Should we enforce that 
timeout within the server, or should we leave that to the external 
heartbeat system?
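
For illustration, the "Limit" idea quoted above could be sketched roughly like this. This is not how the WAL sender is or would be implemented; `wait_for_send_ack` and its return values are hypothetical, and a `threading.Event` stands in for the slave's confirmation. The sketch only reports a timeout and deliberately leaves open what should happen next.

```python
import threading


def wait_for_send_ack(ack: threading.Event, limit_seconds: float) -> str:
    """Wait for the slave's confirmation, up to a limit.

    limit_seconds == 0 means asynchronous: do not wait at all.
    limit_seconds > 0 means synchronous up to the limit.
    Returns "async", "confirmed", or "timed_out" (hypothetical labels).
    """
    if limit_seconds == 0:
        return "async"
    # Event.wait returns True if the event was set before the timeout.
    if ack.wait(timeout=limit_seconds):
        return "confirmed"
    return "timed_out"


# Usage: no confirmation arrives, so a short limit expires.
ack = threading.Event()
result = wait_for_send_ack(ack, 0.05)
```

Whether "timed_out" should fail the commit, fall back to asynchronous operation, or be handed to an external heartbeat system is the question above.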
-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com