Re: Streaming replication and non-blocking I/O

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication and non-blocking I/O
Date: 2009-12-16 09:53:49
Message-ID: 4B28AE2D.2030609@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fujii Masao wrote:
> On Tue, Dec 15, 2009 at 3:47 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Tom Lane wrote:
>>> The very, very large practical problem with this is that if you decide
>>> to change the behavior at any time, the only way to be sure that the WAL
>>> receiver is using the right libpq version is to perform a soname major
>>> version bump. The transformations done by libpq will essentially become
>>> part of its ABI, and not a very visible part at that.
>> Not having to change the libpq API would certainly be a big advantage.
>
> Done; I replaced PQgetXLogData and PQputXLogRecPtr with PQgetCopyData and
> PQputCopyData.

Great! The logical next step is move the handling of TimelineID and
system identifier out of libpq as well.

I'm thinking of refactoring the protocol along these lines:

0. Begin by connecting to the master just like a normal backend does. We
don't necessarily need the new ProtocolVersion code either, though it's
probably still a good idea to reject connections to older server versions.

1. Get the system identifier of the master.

Slave -> Master: Query message, with a query string like
"GET_SYSTEM_IDENTIFIER"

Master -> Slave: RowDescription, DataRow CommandComplete, and
ReadyForQuery messages. The system identifier is returned in the DataRow
message.

This is identical to what happens when a query is executed against a
normal backend using the simple query protocol, so walsender can use
PQexec() for this.

2. Another query exchange like above, for timeline ID. (or these two
steps can be joined into one query, to eliminate one round-trip).

3. Request a backup history file, if needed:

Slave -> Master: Query message, with a query string like
"GET_BACKUP_HISTORY_FILE XXX" where XXX is XLogRecPtr or file name.

Master -> Slave: RowDescription, DataRow CommandComplete and
ReadyForQuery messages as usual. The file contents are returned in the
DataRow message.

4. Start replication

Slave -> Master: Query message, with query string "START REPLICATION:
XXXX", where XXXX is the RecPtr of the starting point.

Master -> Slave: CopyOutResponse followed by a continuous stream of
CopyData messages with WAL contents.

This minimizes the changes to the protocol and libpq, with a clear way
of extending by adding new commands. Similar to what you did a long time
ago, connecting as an actual backend at first and then switching to
walsender mode after running a few queries, but this would all be
handled in a separate loop in walsender instead of running as a
full-blown backend. We'll still need small changes to libpq to allow
sending messages back to the server in COPY_IN mode (maybe add a new
COPY_IN_OUT mode for that).

Thoughts?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Meskes 2009-12-16 10:14:03 Re: ECPG patch N+1, fix auto-prepare
Previous Message Albe Laurenz 2009-12-16 09:52:34 Re: Update on true serializable techniques in MVCC