Re: Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: masao(dot)fujii(at)gmail(dot)com
Cc: michael(dot)paquier(at)gmail(dot)com, seltenreich(at)gmx(dot)de, pgsql-hackers(at)postgresql(dot)org, magnus(at)hagander(dot)net
Subject: Re: Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)
Date: 2016-12-13 10:15:32
Message-ID: 20161213.191532.229001209.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the comment.

At Tue, 13 Dec 2016 12:49:12 +0900, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote in <CAHGQGwHPZ8mwx=FcAxB2kaydhBjM-di7mxy6C2B3V5oWtddp_Q(at)mail(dot)gmail(dot)com>
> On Tue, Nov 8, 2016 at 11:18 AM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> By the way, as long as I look at that, the patch applies cleanly on
> >> master and 9.6 but not further down. If a committer shows up to push
> >> this patch, I can prepare versions for back branches as needed.
> >
> > Ok, the attached is the version just rebased to the current
> > master. It is clieanly appliable without whitespace errors on
> > master and 9.6 with some displacements but fails on 9.5.
> >
> > I will mark this as "Ready for Committer".
>
> Thanks for updating the patch! Here are the review comments;
>
> + * EXCLUSIVE_BACKUP_NONE means that there is no exclusive backup actually
> + * running, to be more precise pg_start_backup() is not being executed for
> + * an exclusive backup and there is exclusive backup in progress.
>
> "there is exclusive" should be "there is no exclusive"?
> ISTM that the description following "to be more" doesn't seem to be necessary.

I agree. One can know that by reading the description on
EXCLUSIVE_BACKUP_STARTING (and STOPPING, which is not mentioned here)

> + * EXCLUSIVE_BACKUP_STARTING means that pg_start_backup() is running and
> + * that is is not done yet.
> + * EXCLUSIVE_BACKUP_STOPPING means that pg_stop_backup() is running and
> + * that is is not done yet.
>
> Typo: "is is"

Oops. Sorry for overlooking such a silly typo.

> Isn't it better to mention "an exclusive backup" here? What about
>
> EXCLUSIVE_BACKUP_STARTING means that pg_start_backup() is starting an exclusive
> backup.
> EXCLUSIVE_BACKUP_STOPPING means that pg_stop_backup() is stopping an exclusive
> backup.

It seems fine. Done in the attached version.

> I think that it's worth explaining why ExclusiveBackupState is necessary,
> in the comment.

The attached version contains the following comment.

| * State of an exclusive backup.
| *
| * EXCLUSIVE_BACKUP_STARTING and EXCLUSIVE_BACKUP_STOPPING blocks
| * pg_start_backup and pg_stop_backup to prevent a race condition. Both of the
| * functions returns error on these state.

Does this make sense?

> - * exclusiveBackup is true if a backup started with pg_start_backup() is
> - * in progress, and nonExclusiveBackups is a counter indicating the number
> + * exclusiveBackupState is changed to EXCLUSIVE_BACKUP_STARTING for the
> + * duration of pg_start_backup(), then to EXCLUSIVE_BACKUP_IN_PROGRESS once
> + * it is done, and nonExclusiveBackups is a counter indicating the number
>
> Why didn't you mention EXCLUSIVE_BACKUP_STOPPING and _NONE?
> Basically it's better to explain fully how the state changes.

Hmm, it is modfied as the following in the attached.

| * exclusiveBackupState indicates state of an exlusive backup, see
| * ExclusiveBackupState for the detail.

and the comment for ExclusiveBackupState explains the state
transition. Does this work for you?

> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("a backup is already starting")));
> + }
> + if (XLogCtl->Insert.exclusiveBackupState == EXCLUSIVE_BACKUP_STOPPING)
> + {
> + WALInsertLockRelease();
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("a backup is currently stopping")));
>
> Isn't it better to use "an exclusive backup" explicitly rather than
> "a backup" here?

pg_stop_backup already says like that. To unify with them, the
messages of pg_start_backup is modifed as the following.

exlusive backup is already starting
exlusive backup is currently stopping
exlusive backup is currently in progress

> You changed the code so that pg_stop_backup() removes backup_label before
> it marks the exclusive-backup-state as not-in-progress. Could you tell me
> why you did this change? It's better to explain it in the comment.

Previously the backup_label is the traffic signal for the backup
state and that is the cause of this problem. Now the
exclusiveBackupState takes the place of it, so it naturally
should be _NONE after all steps have been finished. It seems to
me so natural so that no additional explanation is needed.

If they are done in the opposite order, the existence check of
backup_label still left in pg_start_backup may fire.

| * Check for existing backup label --- implies a backup is already
| * running. (XXX given that we checked exclusiveBackupState above,
| * maybe it would be OK to just unlink any such label file?)
| */
| if (stat(BACKUP_LABEL_FILE, &stat_buf) != 0)
| {
| if (errno != ENOENT)
| ereport(ERROR,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
base-backup-crash-v6.patch text/x-patch 11.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ian Jackson 2016-12-13 11:30:28 Re: [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation [and 2 more messages]
Previous Message Kyotaro HORIGUCHI 2016-12-13 10:06:00 Re: Typmod associated with multi-row VALUES constructs