Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-07 10:16:13
Message-ID: CAA4eK1JXSFzBMVsBoOhSqBSXHE1Nu5Mdr2u+Smrc0Sh+a11aug@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 7, 2023 at 1:06 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Mon, Aug 07, 2023 at 12:42:33PM +0530, Amit Kapila wrote:
> > On Mon, Aug 7, 2023 at 11:29 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > >
> > > Unless I'm missing something, I don't see what prevents something from
> > > connecting using the replication protocol and issuing any query or even
> > > creating new replication slots?
> > >
> >
> > I think the point is that if we have any slots where we have not
> > consumed the pending WAL (other than expected records like
> > SHUTDOWN_CHECKPOINT) or if there are invalid slots, then the upgrade
> > won't proceed and we will ask the user to remove such slots or ensure
> > that the WAL is consumed by the slots. So, I think in the case you
> > mentioned, the upgrade won't succeed.
>
> What if new slots are added while the old instance is started in the middle of
> pg_upgrade, *after* the various checks are done?
>

They won't be copied, but I think that is no different from other
objects like tables. Anyway, I have another idea: disallow creating
slots during binary upgrade unless it is specifically requested through
an API like binary_upgrade_allow_slot_create(), similar to the existing
binary_upgrade_* APIs.
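
To make the idea concrete, here is a minimal standalone sketch, not a
patch against the tree. Everything in it is an assumption for
illustration: the flag names, the helper slot_creation_permitted(), and
the bool return values (the real binary_upgrade_* functions are SQL-callable
and report errors instead of returning a status).

```c
#include <stdbool.h>

/* Stand-in for the server's binary-upgrade mode flag (hypothetical here). */
static bool is_binary_upgrade = false;

/* Hypothetical flag: in binary-upgrade mode, slot creation is refused
 * until this has been set explicitly. */
static bool slot_create_allowed = false;

/*
 * Hypothetical API in the spirit of the existing binary_upgrade_* functions.
 * Returns false when called outside binary-upgrade mode, true otherwise.
 */
static bool
binary_upgrade_allow_slot_create(void)
{
    if (!is_binary_upgrade)
        return false;

    slot_create_allowed = true;
    return true;
}

/*
 * Hypothetical gate to be consulted at the start of slot creation.
 * Outside binary upgrade it never blocks; during binary upgrade it
 * blocks unless creation was explicitly allowed.
 */
static bool
slot_creation_permitted(void)
{
    if (is_binary_upgrade && !slot_create_allowed)
        return false;

    return true;
}
```

With something like this, a connection that sneaks in during the upgrade
would be refused slot creation, while pg_upgrade itself would call the
allow function before restoring slots.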

> > > Note also that, as complained about a few years ago, nothing prevents a
> > > bgworker from spawning during pg_upgrade and possibly corrupting the
> > > upgraded cluster if multixids are assigned. If publications are preserved,
> > > wouldn't it mean that such bgworkers could also lead to data loss?
> > >
> >
> > Is it because such workers would write some WAL which slots may not
> > process? If so, I think it is equally dangerous as other problems that
> > can arise due to such a worker. Do you have any special handling in
> > mind here?
>
> Yes, and there were already multiple reports of multixact corruption due to
> bgworker activity during pg_upgrade (see
> https://www.postgresql.org/message-id/20210121152357.s6eflhqyh4g5e6dv@dalibo.com
> for instance). I think we should once and for all fix this whole class of
> problem one way or another.
>

I don't object to doing something like what we discussed in the thread
you linked, but I don't see the link with this work. Surely, any extra
WAL/XIDs generated during the upgrade can cause data inconsistency, but
that is no different with this patch than without it.

--
With Regards,
Amit Kapila.
