Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-06-28 10:53:38
Message-ID: CAA4eK1JMb8zf1TfdG4bkYuVEczkoKM4YYYK8bXG1ARkg24nTkA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 14, 2023 at 4:00 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > Sorry for the delay, I didn't have time to come back to it until this afternoon.
>
> No issues, everyone is busy:-).
>
> > I don't think that your analysis is correct. Slots are guaranteed to be
> > stopped after all the normal backends have been stopped, exactly to avoid such
> > extraneous records.
> >
> > What is happening here is that the slot's confirmed_flush_lsn is properly
> > updated in memory and ends up being the same as the current LSN before the
> > shutdown. But as it's a logical slot and those records aren't decoded, the
> > slot isn't marked as dirty and therefore isn't saved to disk. You don't see
> > that behavior when doing a manual checkpoint before (per your script comment),
> > as in that case the checkpoint also tries to save the slot to disk but then
> > finds a slot that was marked as dirty and therefore saves it.
> >

Why is the behavior different here for manual and non-manual checkpoints?

> > In your script's scenario, when you restart the server the previous slot data
> > is restored and the confirmed_flush_lsn goes backward, which explains those
> > extraneous records.
>
> So you meant to say that the key point was that some records which are not sent
> to the subscriber do not mark the slot as dirty, hence the updated confirmed_flush
> was not written into the slot file. Is that right? LogicalConfirmReceivedLocation()
> is called by the walsender when the process gets a reply from the worker process,
> so your analysis seems correct.
>

Can you please explain what led to confirmed_flush being updated in
memory but not on disk? BTW, have we ensured that the additional
records we would be discarding have already been sent to the
subscriber? If so, why did the confirmed_flush LSN not advance for
those records?
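
To make the scenario under discussion concrete, here is a toy model of
the behavior described in the quoted analysis: the in-memory position
advances, but because the slot is not marked dirty the checkpoint skips
it, and a restart reloads the older on-disk value. This is only a
sketch. ToySlot, its fields, and simulate() are hypothetical names for
illustration and are not PostgreSQL code; the integer LSNs are made up.

```python
from dataclasses import dataclass


@dataclass
class ToySlot:
    """Toy stand-in for a logical replication slot (hypothetical, not PostgreSQL code)."""
    confirmed_flush_lsn: int  # in-memory position
    on_disk_lsn: int          # position last written to the slot file
    dirty: bool = False

    def confirm_received(self, lsn: int, mark_dirty: bool) -> None:
        # Advance the in-memory position; the dirty flag is set only
        # when the caller asks for it (assumption of this sketch).
        if lsn > self.confirmed_flush_lsn:
            self.confirmed_flush_lsn = lsn
            if mark_dirty:
                self.dirty = True

    def checkpoint(self) -> None:
        # Only slots marked dirty are written out.
        if self.dirty:
            self.on_disk_lsn = self.confirmed_flush_lsn
            self.dirty = False

    def restart(self) -> None:
        # After a restart, the in-memory state is rebuilt from disk.
        self.confirmed_flush_lsn = self.on_disk_lsn
        self.dirty = False


def simulate(mark_dirty: bool) -> int:
    """Return confirmed_flush_lsn as seen after shutdown and restart."""
    slot = ToySlot(confirmed_flush_lsn=100, on_disk_lsn=100)
    slot.confirm_received(200, mark_dirty)
    slot.checkpoint()  # checkpoint taken at shutdown
    slot.restart()
    return slot.confirmed_flush_lsn
```

In this model, simulate(False) comes back with the old position (the
LSN appears to go backward after restart), while simulate(True)
preserves the advanced position. The model does not say which real
code path sets the dirty flag; that is the open question above.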

--
With Regards,
Amit Kapila.
