Re: Sending notifications from the master to the standby

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sending notifications from the master to the standby
Date: 2012-01-10 17:37:14
Message-ID: CA+U5nMJswZLZFOWzdAjROrFMww6yfv39aJbweMDLx35GLew96Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 10, 2012 at 4:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On Tue, Jan 10, 2012 at 5:00 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> It might be a bit tricky to get walreceivers to inject
>>> the data into the slave-side ring buffer at the right time, ie, not
>>> until after the commit a given message describes has been replayed;
>>> but I don't immediately see a reason to think that's infeasible.
>
>> [ Simon sketches a design for that ]
>
> Seems a bit overcomplicated.  I was just thinking of having walreceiver
> note the WAL endpoint at the instant of receipt of a notify message,
> and not release the notify message to the slave ring buffer until WAL
> replay has advanced that far.  You'd need to lay down ground rules about
> how the walsender times the insertion of notify messages relative to
> WAL in its output.

You have to store the messages somewhere until they're needed. If that
somewhere isn't on the standby, very close to the Startup process, then
it's going to be very slow. Putting a marker in the WAL stream
guarantees arrival order. The hash table was just a place to store
them until they're needed; it could be a ring buffer as well.
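
To make the "store them until they're needed" part concrete, here is a
rough sketch, not PostgreSQL code. It assumes each notify message
arrives tagged with the WAL position of its commit, modelled as a plain
integer, and that something tells the buffer how far replay has
advanced. The names PendingNotifies, add and release_up_to are
hypothetical.

```python
import heapq
import itertools


class PendingNotifies:
    """Holds notify messages until WAL replay reaches their commit position."""

    def __init__(self):
        # Heap entries are (commit_lsn, arrival_no, payload).
        self._heap = []
        # Arrival counter breaks ties so equal LSNs keep arrival order.
        self._arrival = itertools.count()

    def add(self, commit_lsn, payload):
        """Buffer a message received ahead of replay (hypothetical API)."""
        heapq.heappush(self._heap, (commit_lsn, next(self._arrival), payload))

    def release_up_to(self, replayed_lsn):
        """Return, in WAL order, all messages whose commit has been replayed."""
        ready = []
        while self._heap and self._heap[0][0] <= replayed_lsn:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


pending = PendingNotifies()
pending.add(300, "c")   # arrives first but commits last
pending.add(100, "a")
pending.add(200, "b")
first = pending.release_up_to(250)    # replay has passed 100 and 200
second = pending.release_up_to(400)   # replay has now passed 300
```

Whether the container is a heap, a hash table or a ring buffer does not
matter for ordering here; the order comes from the WAL position alone.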

Inserts into the slave ring buffer already have an xid on them, so the
test will probably already cope with messages that have been inserted
but whose parent xid has not yet committed. The only problem is coping
with possible out-of-sequence messages.
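
As a simplified model of that test (an illustration only, not the
actual reader code), assume each queue entry carries its xid and the
reader can ask whether that xid committed, aborted, or is still in
progress. The function name deliverable and the two sets are
hypothetical.

```python
def deliverable(queue, committed, aborted):
    """Return the payloads a reader may deliver from the head of the queue.

    queue     -- list of (xid, payload) in insertion order
    committed -- set of xids known to have committed
    aborted   -- set of xids known to have aborted
    Any xid in neither set is treated as still in progress.
    """
    out = []
    for xid, payload in queue:
        if xid in committed:
            out.append(payload)
        elif xid in aborted:
            continue  # drop messages from aborted transactions
        else:
            break     # in progress: stop so later entries cannot overtake it
    return out


queue = [(10, "a"), (11, "b"), (12, "c"), (13, "d")]
visible = deliverable(queue, committed={10, 13}, aborted={11})
```

Here only "a" is visible: xid 12 has not committed yet, so "d" waits
behind it even though xid 13 has committed. That handles uncommitted
entries, but it relies on the queue already being in the right order.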

> But I don't see the need for either explicit markers
> in the WAL stream or a hash table.  Indeed, a hash table scares me
> because it doesn't clearly guarantee that notifies will be released in
> arrival order.

The hash table is clearly not the thing providing an arrival order
guarantee; it was just a cache.

You have a few choices: (1) send the message while holding an
exclusive lock; (2) send them as they come and buffer them, then
reorder them using the WAL sequence, since that matches the original
commit sequence; or (3) add a sequence number to the messages sent by
WALSender, so that the WALReceiver can buffer them locally and insert
them in the correct order into the normal ring buffer. In (3) the
message sequence and the WAL sequence match, but the mechanism is
different.
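
A rough sketch of (3), under the assumption that the sender numbers
messages consecutively starting at 1 and the receiver holds back
anything that arrives early. SequenceReorderer and receive are
hypothetical names, not existing code.

```python
class SequenceReorderer:
    """Releases messages strictly in sender-assigned sequence order."""

    def __init__(self, first_seq=1):
        self._next = first_seq   # next sequence number we may release
        self._held = {}          # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one message; return every message that is now in order."""
        self._held[seq] = payload
        ready = []
        while self._next in self._held:
            ready.append(self._held.pop(self._next))
            self._next += 1
        return ready


reorderer = SequenceReorderer()
step1 = reorderer.receive(2, "b")   # early: held until 1 arrives
step2 = reorderer.receive(1, "a")   # fills the gap, releases both
step3 = reorderer.receive(3, "c")   # already in order
```

The local buffer can be a hash table precisely because the sequence
number, not the container, decides the release order.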

(1) is out because the purpose of offloading to the standby is to give
the master more capacity. If we slow it down in order to serve the
standby we're doing things the wrong way around.

I was choosing (2); maybe you prefer (3) or another design entirely.
They look very similar to me and about the same complexity; it's just
copying data and preserving sequence.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
