Re: Streaming replication status

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-09 07:25:31
Message-ID: 4B482F6B.80900@enterprisedb.com
Lists: pgsql-hackers

Fujii Masao wrote:
> On Sat, Jan 9, 2010 at 6:16 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> * If there's no WAL to send, walsender doesn't notice if the client has
>> closed connection already. This is the issue Fujii reported already.
>> We'll need to add a select() call to the walsender main loop to check if
>> the socket has been closed.
>
> We should reactivate pq_wait() and secure_poll()?

I don't think we need all that; a simple select() should be enough.
Though I must admit I'm not very familiar with select/poll().

>> * We still have a related issue, though: if standby is configured to
>> archive to the same location as master (as it always is on my laptop,
>> where I use the postgresql.conf of the master unmodified in the server),
>> right after failover the standby server will try to archive all the old
>> WAL files that were streamed from the master; but they exist already in
>> the archive, as the master archived them already. I'm not sure if this
>> is a pilot error, or if we should do something in the server to tell
>> apart WAL segments streamed from master and those generated in the
>> standby server after failover. Maybe we should immediately create a
>> .done file for every file received from master?
>
> There is no guarantee that such file has already been archived by master.
> This is just an idea, but new WAL record indicating the completion of the
> archiving would be useful for the standby to create .done file. But, this idea
> might kill the "archiving during recovery" idea discussed above.
>
> Personally, I'm OK with that issue because we can avoid it by tweaking
> archive_command. Could we revisit this discussion with the "archiving
> during recovery" discussion later?

Ok. The workaround is to configure standby to archive to a different
location. If you need to restore from that, you'll need to stitch
together the logs from the old master and the new one.

>> * A standby that connects to master, initiates streaming, and then sits
>> idle stalls recycling of old WAL files in the master. That will
>> eventually lead to a full disk in master. Do we need some kind of an
>> emergency valve on that?
>
> I think that we need the GUC parameter to specify the maximum number
> of log file segments held in pg_xlog directory to send to the standby server.
> The replication to the standby which falls more than that GUC value behind
> is just terminated.
> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01901.php

Oh yes, sounds good.
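
The check itself could be as simple as the sketch below. All names here
are hypothetical (the GUC has no name yet), positions are plain byte
offsets into the WAL stream rather than real XLogRecPtr values, and I'm
assuming the default 16 MB segment size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WAL_SEGMENT_BYTES (16 * 1024 * 1024)    /* default segment size */

/*
 * Hypothetical check: has the standby fallen more than max_segments_kept
 * whole segments behind the master's current write position?
 */
static bool
standby_too_far_behind(uint64_t master_write_pos,
                       uint64_t standby_sent_pos,
                       uint32_t max_segments_kept)
{
    uint64_t    segs_behind;

    assert(standby_sent_pos <= master_write_pos);

    /* Compare segment numbers, not raw byte distance. */
    segs_behind = master_write_pos / WAL_SEGMENT_BYTES
                - standby_sent_pos / WAL_SEGMENT_BYTES;

    return segs_behind > max_segments_kept;
}
```

When this returns true, the master would terminate replication to that
standby and go back to recycling old segments as usual.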

>> * Do we really need to split the sleep in walsender to NAPTIME_PER_CYCLE
>> increments?
>
> Yes. It's required for some platforms (probably HP-UX) in which signals
> cannot interrupt the sleep.

I'm thinking that the wal_sender_delay is so small that maybe it's not
worth worrying about.
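
For reference, the pattern in question looks roughly like this. It is a
standalone sketch, not the walsender code: interruptible_sleep_ms() and
shutdown_requested are made-up names, and nanosleep() stands in for
whatever sleep primitive the server actually uses.

```c
#include <assert.h>
#include <signal.h>
#include <time.h>

/* Set from a signal handler in a real program; hypothetical name. */
static volatile sig_atomic_t shutdown_requested = 0;

/*
 * Sleep for total_ms, but in naps of at most nap_ms, rechecking the flag
 * between naps. This bounds the reaction time on platforms where a
 * signal does not interrupt the sleep. Returns the number of naps taken.
 */
static int
interruptible_sleep_ms(long total_ms, long nap_ms)
{
    long    remaining = total_ms;
    int     naps = 0;

    assert(nap_ms > 0);

    while (remaining > 0 && !shutdown_requested)
    {
        long            this_nap = remaining < nap_ms ? remaining : nap_ms;
        struct timespec ts;

        ts.tv_sec = this_nap / 1000;
        ts.tv_nsec = (this_nap % 1000) * 1000000L;
        nanosleep(&ts, NULL);

        remaining -= this_nap;
        naps++;
    }
    return naps;
}
```

If the total delay is already no longer than one nap, the loop runs
once and the split buys nothing, which is the case I have in mind.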

>> * Walreceiver should flush less aggressively than after each received
>> piece of WAL as noted by XXX comment.
>
>> * XXX: Flushing after each received message is overly aggressive. Should
>> * implement some sort of lazy flushing. Perhaps check in the main loop
>> * if there's any more messages before blocking and waiting for one, and
>> * flush the WAL if there isn't, just before blocking.
>
> In this approach, if messages continuously arrive from master, the fsync
> would be delayed until WAL segment is switched. Likewise, recovery also
> would be delayed, which seems to be problem.

That seems OK to me. If messages are really coming in that fast,
fsyncing the whole WAL segment at a time is probably most efficient.

But if that really is too much, you could still do extra flushes within
XLogRecv() every few megabytes for example.
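
Roughly like this, as a sketch only. RecvState, handle_message() and
flush_wal() are made-up names, the 4 MB threshold is an arbitrary pick
for "every few megabytes", and the flush is a stand-in that only counts
calls instead of doing a real fsync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary illustrative cap on unflushed WAL. */
#define FLUSH_THRESHOLD_BYTES ((size_t) 4 * 1024 * 1024)

typedef struct
{
    size_t  unflushed_bytes;    /* written but not yet flushed */
    int     flush_count;        /* number of flushes issued so far */
} RecvState;

/* Stand-in for the real fsync of the WAL file. */
static void
flush_wal(RecvState *st)
{
    st->unflushed_bytes = 0;
    st->flush_count++;
}

/*
 * Account for one received message of len bytes. Flush only when we are
 * about to block (no more messages pending) or when too much WAL has
 * accumulated since the last flush.
 */
static void
handle_message(RecvState *st, size_t len, bool more_pending)
{
    assert(st != NULL);

    st->unflushed_bytes += len;

    if (!more_pending || st->unflushed_bytes >= FLUSH_THRESHOLD_BYTES)
        flush_wal(st);
}
```

With a steady stream of messages this flushes once per threshold
instead of once per message, and an idle connection still gets
everything flushed before walreceiver blocks.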

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
