Re: non-exclusive backup cleanup is mildly broken

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: robertmhaas(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: non-exclusive backup cleanup is mildly broken
Date: 2019-12-12 04:57:51
Message-ID: 20191212.135751.55206725964967977.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Wed, 11 Dec 2019 17:32:05 -0500, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
> While reviewing the first patch in Asif Rehman's series of patches for
> parallel backup over at
> http://postgr.es/m/CADM=Jeg3ZN+kPQpiSfeWCXr=xgpLrq4cBQE5ZviUxygKq3VqiA@mail.gmail.com
> I discovered that commit 7117685461af50f50c03f43e6a622284c8d54694
> introduced a use of cancel_before_shmem_exit which falsifies the
> comment for that function. So you can cause a spurious WARNING in the
> logs by doing something like this, with max_prepared_transactions set
> to a non-zero value:
>
> select pg_start_backup('one', false, false);
> begin;
> prepare transaction 'nothing';
> select pg_stop_backup(false);
> \q
>
> in the server log:
> WARNING: aborting backup due to backend exiting before pg_stop_backup
> was called
>
> And you can cause an assertion failure like this:
>
> select pg_start_backup('one', false, false);
> begin;
> prepare transaction 'nothing';
> select pg_stop_backup(false);
> select pg_start_backup('two');
> \q
>
> We've discussed before the idea that it might be good to remove the
> limitation that before_shmem_exit() can only remove the
> most-recently-added callback, which would be one way to fix this
> problem, but I'd like to propose an alternative solution which I think
> will work out more nicely for the patch mentioned above: don't use
> cancel_before_shmem_exit, and just leave the callback registered for
> the lifetime of the backend. That requires some adjustment of the
> callback, since it needs to tolerate exclusive backup mode being in
> effect.
>
> The attached patch takes that approach. Thoughts welcome on (1) the
> approach and (2) whether to back-patch. I think there's little doubt
> that this is formally a bug, but it's a pretty minor one.

The direction seems reasonable, but the patch doesn't free up the
before_shmem_exec slot nor avoid duplicate registration of the
callback. Actually before_shmem_exit_list gets bloat with multiple
do_pg_abort_backup entries through repeated rounds of non-exclusive
backups.

As the result, if one ends a session while a non-exclusive backup is
active after closing the previous non-exclusive backup,
do_pg_abort_backup aborts for assertion failure.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-12-12 05:09:44 Re: Collation versioning
Previous Message asaba.takanori@fujitsu.com 2019-12-12 04:51:48 RE: Conflict handling for COPY FROM