Re: non-exclusive backup cleanup is mildly broken

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: non-exclusive backup cleanup is mildly broken
Date: 2019-12-12 12:52:31
Message-ID: CABUevEz5AzHfZmE2fNEj+6WwdNP5B6X0DNc1d6orHkw5ccrrYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 12, 2019 at 5:58 AM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
wrote:

> Hello.
>
> At Wed, 11 Dec 2019 17:32:05 -0500, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote in
> > While reviewing the first patch in Asif Rehman's series of patches for
> > parallel backup over at
> >
> http://postgr.es/m/CADM=Jeg3ZN+kPQpiSfeWCXr=xgpLrq4cBQE5ZviUxygKq3VqiA@mail.gmail.com
> > I discovered that commit 7117685461af50f50c03f43e6a622284c8d54694
> > introduced a use of cancel_before_shmem_exit which falsifies the
> > comment for that function. So you can cause a spurious WARNING in the
> > logs by doing something like this, with max_prepared_transactions set
> > to a non-zero value:
> >
> > select pg_start_backup('one', false, false);
> > begin;
> > prepare transaction 'nothing';
> > select pg_stop_backup(false);
> > \q
> >
> > in the server log:
> > WARNING: aborting backup due to backend exiting before pg_stop_backup
> > was called
> >
> > And you can cause an assertion failure like this:
> >
> > select pg_start_backup('one', false, false);
> > begin;
> > prepare transaction 'nothing';
> > select pg_stop_backup(false);
> > select pg_start_backup('two');
> > \q
> >
> > We've discussed before the idea that it might be good to remove the
> > limitation that before_shmem_exit() can only remove the
> > most-recently-added callback, which would be one way to fix this
> > problem, but I'd like to propose an alternative solution which I think
> > will work out more nicely for the patch mentioned above: don't use
> > cancel_before_shmem_exit, and just leave the callback registered for
> > the lifetime of the backend. That requires some adjustment of the
> > callback, since it needs to tolerate exclusive backup mode being in
> > effect.
> >
> > The attached patch takes that approach. Thoughts welcome on (1) the
> > approach and (2) whether to back-patch. I think there's little doubt
> > that this is formally a bug, but it's a pretty minor one.
>
> The direction seems reasonable, but the patch doesn't free up the
> before_shmem_exec slot nor avoid duplicate registration of the
> callback. Actually before_shmem_exit_list gets bloat with multiple
> do_pg_abort_backup entries through repeated rounds of non-exclusive
> backups.
>
> As the result, if one ends a session while a non-exclusive backup is
> active after closing the previous non-exclusive backup,
> do_pg_abort_backup aborts for assertion failure.
>

+1, I also think the direction seems perfectly reasonable, but we should
avoid re-adding the callback since we're not removing it. Leaving it around
seems cheap enough as long as there is only one.

My first reaction would be to just disallow the combination of prepared
transactions and start/stop backups. But lookingat it it seems like an
unnecessray restriction and an approach like this one seems better.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2019-12-12 13:25:50 Re: Wrong assert in TransactionGroupUpdateXidStatus
Previous Message Kyotaro Horiguchi 2019-12-12 12:40:57 Re: Append with naive multiplexing of FDWs