Re: Streaming Replication patch for CommitFest 2009-09

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming Replication patch for CommitFest 2009-09
Date: 2009-09-24 10:57:22
Message-ID: 4ABB5092.8020502@enterprisedb.com
Lists: pgsql-hackers

Fujii Masao wrote:
> On Mon, Sep 21, 2009 at 4:51 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> - Can we replace read/write_conninfo with just a long-enough field in
>> shared mem? Would be simpler. (this is moot if we go with the
>> stand-alone walreceiver program and pass it as a command-line argument)
>
> Yes, if we can decide the length of conninfo. Since I could not decide
> that, I used read/write_conninfo to tell walreceiver the conninfo. Is the
> fixed size 1024B enough for conninfo?

Yeah, that should be plenty.
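
To make the fixed-size idea concrete, here is a minimal sketch, assuming a hypothetical WalRcvShared struct and a MAXCONNINFO limit of 1024 bytes. Neither name comes from the patch, and a real shared-memory area would also need locking, which is left out here.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed limit, taken from the 1024-byte figure discussed above. */
#define MAXCONNINFO 1024

/* Hypothetical shared-memory area; not the struct used by the patch. */
typedef struct WalRcvShared
{
    char conninfo[MAXCONNINFO];     /* NUL-terminated connection string */
} WalRcvShared;

/*
 * Copy conninfo into the shared area. Returns false, leaving the area
 * untouched, if the string plus its terminator does not fit.
 */
bool
walrcv_set_conninfo(WalRcvShared *shared, const char *conninfo)
{
    size_t len = strlen(conninfo);

    if (len >= MAXCONNINFO)
        return false;
    memcpy(shared->conninfo, conninfo, len + 1);
    return true;
}
```

Rejecting an over-long string, rather than truncating it silently, keeps walreceiver from trying to connect with a half-formed conninfo.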

>> - walreceiver shouldn't die on connection error, just to be restarted by
>> startup process. Can we add error handling a la bgwriter and have a
>> retry loop within walreceiver? (again, if we go with a stand-alone
>> walreceiver program, it's probably better to have startup process
>> responsible to restart walreceiver, as it is now)
>
> Error handling a la bgwriter? You mean that PG_exception_stack
> should be set up to handle an ERROR exception?

Yep.

> Anyway, I'll change walreceiver to retry connecting to the primary
> after an error occurs in PQstartXLogStreaming()/PQgetXLogData()/
> PQputXLogRecPtr(). Should we set an upper limit on the number of
> retries?

I don't think we need an upper limit.
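
A rough sketch of what such a retry loop could look like, assuming a jump-buffer style of error recovery similar in spirit to the PG_exception_stack setup. Everything here is a hypothetical stand-in: exception_stack, raise_error(), stream_with_retry() and fail_n_times() are illustration names, not backend code, and plain setjmp/longjmp is used to keep the example portable.

```c
#include <setjmp.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the backend's exception stack pointer. */
static jmp_buf *exception_stack = NULL;

/* Hypothetical stand-in for raising an ERROR: jump back to the handler. */
void
raise_error(void)
{
    if (exception_stack != NULL)
        longjmp(*exception_stack, 1);
    abort();                        /* no handler installed */
}

typedef void (*stream_fn)(void *arg);

/*
 * Run stream() until it returns normally. Each error raised inside it is
 * caught here, followed by a pause of delay_secs and another attempt.
 * There is deliberately no upper limit on the number of retries.
 * Returns the number of errors recovered from.
 */
unsigned long
stream_with_retry(stream_fn stream, void *arg, unsigned int delay_secs)
{
    jmp_buf local_buf;
    volatile unsigned long failures = 0;

    if (setjmp(local_buf) != 0)
    {
        /* We got here through raise_error(): count it and back off. */
        failures++;
        if (delay_secs > 0)
            sleep(delay_secs);
    }

    exception_stack = &local_buf;
    stream(arg);                    /* connect and stream until done */
    exception_stack = NULL;

    return failures;
}

/* Stand-in for the connect step: raises an error *remaining times. */
void
fail_n_times(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0)
    {
        (*remaining)--;
        raise_error();
    }
}
```

The counter is declared volatile because it is modified between setjmp and longjmp; without that its value after the jump would be indeterminate.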

>> - pq_wait in backend waits until you can read or write at least 1 byte.
>> There is no guarantee that you can send or read the whole message
>> without blocking. We'd have to put the socket in non-blocking mode for
>> that. I'm not sure what the implications of this are.
>
> Umm... AFAIK, poll and select guarantee that at least the subsequent
> recv will not be blocked. If there is only 1 byte available in the buffer,
> recv would read that 1 byte and return immediately. I'm not sure if send
> will get stuck even after poll is passed. In my environment (RHEL5),
> send seems not to be blocked.

Hmm, I guess you're right.
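
For the receive side, the behaviour described above can be sketched as follows. wait_and_recv() is a hypothetical helper written for this example, not the pq_wait from the patch, and it says nothing about the send side.

```c
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read whatever is
 * available, up to len bytes. Returns the number of bytes read, 0 on EOF,
 * or -1 on error or timeout.
 */
ssize_t
wait_and_recv(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return -1;                  /* error or timeout */

    /*
     * Without MSG_WAITALL, recv on a stream socket returns as soon as any
     * data is present; it does not wait for the buffer to fill.
     */
    return recv(fd, buf, len, 0);
}
```

With a single byte queued on the socket and a 16-byte buffer, the call returns 1 instead of waiting for more data, so the caller has to loop until it has a whole message.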

>> - I know I said we should have just asynchronous replication at first,
>> but looking ahead, how would you do synchronous?
>
> As the previous patch did, I'm going to make walsender read the latest
> XLOG from wal_buffers, introduce the signaling between a backend
> and walsender, and keep a backend waiting until the specified XLOG
> has been written or fsynced in the standby.

Ok. I don't think walsender needs to access wal_buffers even then,
though. Once the backend has written the WAL, walsender can well read it
from disk (it will surely be in OS cache still).
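
As a sketch of the read-from-disk approach: once a backend has written WAL up to some position, the sender only needs to read the byte range it has not sent yet from the segment file. read_wal_range() below is a hypothetical helper for illustration; the patch's real function names and file layout are not shown here.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly len bytes starting at offset from an already-open file.
 * Returns 0 on success, -1 on error or if the file ends early.
 */
int
read_wal_range(int fd, off_t offset, char *buf, size_t len)
{
    size_t done = 0;

    while (done < len)
    {
        ssize_t n = pread(fd, buf + done, len - done, offset + (off_t) done);

        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, try again */
        if (n <= 0)
            return -1;              /* error or premature end of file */
        done += (size_t) n;
    }
    return 0;
}
```

pread takes an explicit offset, so the file position is left alone and the caller only has to track how far it has already sent.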

>> What kind of signaling
>> is needed between walreceiver and startup process for that?
>
> I was thinking that the synchronization mode in which a client waits
> until XLOG has been applied is not necessary right now, so no
> signaling is required between those processes yet. But does
> HS require this capability?

Yeah, I think it will be important with hot standby. It's a much more
useful guarantee that once COMMIT returns, the transaction is visible
in the standby, than that it's merely fsync'd to disk in the standby.

(don't need to solve it now, let's do just asynchronous mode now, but
it's something to keep in mind)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
