Re: Switching timeline over streaming replication

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>
Cc: "'PostgreSQL-development'" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Switching timeline over streaming replication
Date: 2012-10-03 15:15:16
Message-ID: 001601cda179$e55e9c30$b01bd490$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
> Thanks for the thorough review! I committed the xlog.c refactoring patch
> now. Attached is a new version of the main patch, comments on specific
> points below. I didn't adjust the docs per your comments yet, will do
> that next.

I have some doubts regarding the comments fixed by you and some more new
review comments.
After this I shall focus majorly towards testing of this Patch.

> On 01.10.2012 05:25, Amit kapila wrote:
> > 1. In function readTimeLineHistory(),
> > two mechanisms are used to fetch timeline from history file
> > + sscanf(fline, "%u\t%X/%X", &tli, &switchpoint_hi,
> > &switchpoint_lo);
> > +
>
> > 8. In function writeTimeLineHistoryFile(), will it not be better to
> > directly write rather than to later do pg_fsync().
> > as it is just one time write.
>
> Not sure I understood this right, but writeTimeLineHistoryFile() needs
> to avoid putting a corrupt, e.g incomplete, file in pg_xlog. The same as
> writeTimeLineHistory(). That's why the write+fsync+rename dance is
> needed.
>
Why fsync, isn't the above purpose be resolved if write directly writes to
file and then rename.

> > 21. @@ -2411,27 +2411,6 @@ reaper(SIGNAL_ARGS)
> >
> > a. won't it impact stop of online basebackup functionality
> because earlier on promotion
> > I think this code will stop walsenders and basebackup stop
> will also give error in such cases.
>
> Hmm, should a base backup be aborted when the standby is promoted? Does
> the promotion render the backup corrupt?

I think currently it does so. Pls refer
1.
do_pg_stop_backup(char *labelfile, bool waitforarchive)
{
..
if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
ereport(ERROR,

(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("the standby was promoted during
online backup"),
errhint("This means that the backup being
taken is corrupt "
"and should not be used. "
"Try taking another online
backup.")));
..

}

2. In documentation of pg_basebackup there is a Note:
.If the standby is promoted to the master during online backup, the backup
fails.

New Ones
---------------
35.WalSenderMain(void)
{
..
+ if (walsender_shutdown_requested)
+ ereport(FATAL,
+ (errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("terminating replication
connection due to administrator command")));
+
+ /* Tell the client that we are ready to receive commands */

+ ReadyForQuery(DestRemote);
+
..

+ if (walsender_shutdown_requested)
+ ereport(FATAL,
+ (errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("terminating replication
connection due to administrator command")));
+

is it necessary to check walsender_shutdown_requested 2 times in a loop, if
yes, then
can we write comment why it is important to check it again.

35. +SendTimeLineHistory(TimeLineHistoryCmd *cmd)
{
..
+ fd = PathNameOpenFile(path, O_RDONLY | PG_BINARY, 0666);

error handling for fd < 0 is missing.

36.+SendTimeLineHistory(TimeLineHistoryCmd *cmd)
{
..
if (nread <= 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not read file
\"%s\": %m",
+ path)));

FileClose should be done in error case as well.

37. static void
XLogSend(char *msgbuf, bool *caughtup)
{
..
if (currentTimeLineIsHistoric && XLByteLE(currentTimeLineValidUpto,
sentPtr))
{
/*
* This was a historic timeline, and we've reached
the point where
* we switched to the next timeline. Let the client
know we've
* reached the end of this timeline, and what the
next timeline is.
*/
/* close the current file. */
if (sendFile >= 0)
close(sendFile);
sendFile = -1;
*caughtup = true;

/* Send CopyDone */
pq_putmessage_noblock('c', NULL, 0);
streamingDoneSending = true;
return;
}
}

I am not able to understand after sending CopyDone message from above code,
how walreceiver is handling it and then replying it a CopyDone message.
Basically I want to know the walreceiver code which handles it?

38.
static void
WalSndLoop(void)
{
@@ -756,18 +898,24 @@ WalSndLoop(void)

/* Normal exit from the walsender is here */
if (walsender_shutdown_requested)
- {
- /* Inform the standby that XLOG streaming is done
*/
- pq_puttextmessage('C', "COPY 0");
- pq_flush();
-
- proc_exit(0);
- }
+ ereport(FATAL,
+ (errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("terminating replication
connection due to administrator command")));

What is the reason of removal of sending above message to standby when
shutdown was requested?

39. WalSndLoop(void)
{
..
/* If nothing remains to be sent right now ... */
if (caughtup && !pq_is_send_pending())
{
/*
* If we're in catchup state, move to streaming.
This is an
* important state change for users to know about,
since before
* this point data loss might occur if the primary
dies and we
* need to failover to the standby. The state change
is also
* important for synchronous replication, since
commits that
* started to wait at that point might wait for some
time.
*/
if (MyWalSnd->state == WALSNDSTATE_CATCHUP)
{
ereport(DEBUG1,
(errmsg("standby \"%s\" has now
caught up with primary",

application_name)));
WalSndSetState(WALSNDSTATE_STREAMING);
}
..
}

After new implementation, I think above if loop [if (caughtup &&
!pq_is_send_pending())] can be true
when the standby has not actually caught up as now it sends the data from
previous timelines.

With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-10-03 15:35:31 Re: PQping command line tool
Previous Message Andres Freund 2012-10-03 14:52:17 Re: Support for REINDEX CONCURRENTLY