From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Assertion failure when the non-exclusive pg_stop_backup aborted. |
Date: | 2017-09-21 23:02:35 |
Message-ID: | CAB7nPqSxkT_qP67fDnVGyYEALW6G2K8bTkAAk3DcB4jtPZJ4cA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Sep 21, 2017 at 5:56 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Thu, Sep 21, 2017 at 4:40 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Thu, Sep 21, 2017 at 2:25 PM, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>> +- Assert(XLogCtl->Insert.nonExclusiveBackups >= 0);
>>> + if (XLogCtl->Insert.nonExclusiveBackups > 0)
>>> + XLogCtl->Insert.nonExclusiveBackups--;
>>> Hm, no, I don't agree. I think that instead you should just leave
>>> do_pg_abort_backup() immediately if sessionBackupState is set to
>>> SESSION_BACKUP_NONE. This variable is the link between the global
>>> counters and the session stopping the backup so I don't think that we
>>> should touch this assertion of this counter. I think that this method
>>> would be safe as well for backup start as pg_start_backup_callback
>>> takes care of any cleanup. Also because the counters are incremented
>>> before entering in the PG_ENSURE_ERROR_CLEANUP block, and
>>> sessionBackupState is updated just after leaving the block.
>>
>> I think that the assertion failure still can happen if the process
>> aborts after decremented the counter and before setting to
>> SESSION_BACKUP_NONE. Am I missing something?
>
> The process would stop at the next CHECK_FOR_INTERRUPTS() and trigger
> the cleanup at this moment. So this happens when waiting for the
> archives to be done, and the session flag is set to NONE at this
> point.
And actually, with two non-exclusive backups taken in parallel, it is
still possible to fail on another assertion in do_pg_stop_backup()
even with your patch. Imagine the following:
1) Two backups are taken, counter for non-exclusive backups is set at 2.
2) One backup is stopped, then interrupted. This causes the counter to
be decremented twice, once in do_pg_stop_backup, and once when
aborting. Counter is at 0, switching as well forcePageWrites to
false..
3) The second backup stops, a new assertion failure is triggered.
Without assertions the counter would get at -1.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Nikita Glukhov | 2017-09-22 00:03:50 | Re: compress method for spgist - 2 |
Previous Message | Tom Lane | 2017-09-21 22:26:50 | Re: [Proposal] Allow users to specify multiple tables in VACUUM commands |