Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
Date: 2012-06-11 16:06:25
Message-ID: CA+TgmoZLWgr0StwDMxxmN2fmS57EX7vryQP3Pda5B6Ap0DbHPQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Tue, Jun 5, 2012 at 4:51 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> We might want to have a different definition of apply delay for
> different purposes, so an improved definition of apply delay doesn't
> necessarily mean changing standby delay mechanism.
>
> An improved definition of apply delay would be, IMHO
> if (XLByteLE(receivePtr, replayPtr))
>     return 0;
> if (recoveryLastXTime > currentChunkStartTime)
>     then LastKnownTS = LastAppliedTS
>     else LastKnownTS = StartChunkTS
> ApplyDelay = TimestampDifference(LastKnownTS, GetCurrentTimestamp()….);
>
> Which assumes the clocks are in sync. It also doesn't give very useful
> answers when no commits are occurring, and can hide the effects of
> large amounts of WAL generated by VACUUMs. So we need a better
> definition.

Another problem is that it sometimes subtracts two slave timestamps,
and sometimes subtracts a master timestamp from a slave timestamp. If
we're assuming that the clocks must be in sync, you could argue that's
OK, but I think it will lead to weird edge-case behavior.

Suppose that we have the master guarantee that at least one
timestamped WAL record will be emitted every N seconds. For the sake
of argument, let's say N = 5. So, every 5 seconds, some process wakes
up on the master and checks whether any commit or abort record - or
any other kind of WAL record that carries a timestamp - has been
emitted in the last 5 seconds. If so, then it does nothing. If not,
it checks whether any WAL at all has been emitted since the last
timestamped record was generated. If not, then it again does
nothing.  But if so, then it emits a WAL record which consists solely
of a master timestamp.
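
To make the master-side rule concrete, here is a rough sketch in Python
rather than backend C. Everything in it is hypothetical: the class
MasterWalState, its fields, and the names emit() and wakeup() are made up
for illustration and do not correspond to anything in the PostgreSQL
source. Times are plain seconds on the master's clock.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL = 5.0  # "N" from the text, in seconds (N = 5 for the sake of argument)


@dataclass
class MasterWalState:
    """Hypothetical bookkeeping for the sketch; not a real PostgreSQL structure."""
    last_timestamped_at: float = 0.0   # master time of the last timestamped record
    wal_since_timestamp: bool = False  # has any WAL been emitted after that record?

    def emit(self, now: float, has_timestamp: bool) -> None:
        """Note that a WAL record was written at master time `now`."""
        if has_timestamp:
            self.last_timestamped_at = now
            self.wal_since_timestamp = False
        else:
            self.wal_since_timestamp = True

    def wakeup(self, now: float) -> bool:
        """Periodic check; returns True if a timestamp-only record was emitted."""
        if now - self.last_timestamped_at < HEARTBEAT_INTERVAL:
            return False  # a timestamped record was emitted recently: do nothing
        if not self.wal_since_timestamp:
            return False  # no WAL at all since that record: again do nothing
        self.emit(now, has_timestamp=True)  # record consisting solely of a timestamp
        return True
```

Note that an idle master emits nothing under this rule, so the
timestamp-only record adds no WAL volume when there is no other WAL to
date.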

On the slave, every time we reach a commit record, an abort record, or
one of these new master-timestamp records, or any other record that
happens to have a timestamp, we update some shared memory area which
stores (a) the last master timestamp we saw during replay and (b) the
slave timestamp at the time we replayed it. Apply delay (ignoring
time skew) can be calculated by subtracting the first value from the
second one, or we could expose the two values separately, which might
be even better, since users can then answer questions like "how long
has it been since we were able to recalculate the apply delay?".
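
The slave-side bookkeeping could look roughly like the following, again
as a Python sketch and not as proposed backend code. ReplayTimestamps and
its method names are hypothetical stand-ins for the shared memory area;
timestamps are plain seconds, and clock skew is ignored as in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplayTimestamps:
    """Hypothetical stand-in for the shared memory area described above."""
    master_ts: Optional[float] = None  # (a) last master timestamp seen during replay
    slave_ts: Optional[float] = None   # (b) slave clock at the time it was replayed

    def on_replay(self, record_master_ts: Optional[float], slave_now: float) -> None:
        """Call for each replayed record; records without a timestamp change nothing."""
        if record_master_ts is None:
            return
        self.master_ts = record_master_ts
        self.slave_ts = slave_now

    def apply_delay(self) -> Optional[float]:
        """(b) minus (a), ignoring time skew; None until a timestamp has been replayed."""
        if self.master_ts is None or self.slave_ts is None:
            return None
        return self.slave_ts - self.master_ts

    def seconds_since_update(self, slave_now: float) -> Optional[float]:
        """How long it has been since the apply delay could last be recalculated."""
        if self.slave_ts is None:
            return None
        return slave_now - self.slave_ts
```

Keeping (a) and (b) as separate fields is what lets the second question
be answered at all; storing only their difference would lose it.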

I'm sure that at least one member of the audience will have some rocks
to throw at this proposal... fire away, but be gentle, since we are
all on the same team here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
