Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record

From: Andres Freund <andres(at)anarazel(dot)de>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record
Date: 2026-03-10 17:11:09
Message-ID: vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux
Lists: pgsql-hackers

Hi,

On 2026-03-06 16:48:06 +0900, Fujii Masao wrote:
> On Fri, Mar 6, 2026 at 8:46 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> >
> > On Thu, Mar 5, 2026 at 5:40 PM Anthonin Bonnefoy
> > <anthonin(dot)bonnefoy(at)datadoghq(dot)com> wrote:
> > > So it was relying on GetInsertRecPtr() instead of
> > > GetXLogInsertRecPtr(). As mentioned in the thread, GetInsertRecPtr()
> > > only returns the position of the last full xlog page, meaning it
> > > doesn't fix the issue we have where the last partial page contains a
> > > continuation record.
> > >
> > > Testing the XLogFlush(GetInsertRecPtr()) patch with my script, I still
> > > get the shutdown stuck issue.
> > >
> > > Using GetXLogInsertRecPtr() is required to make sure the last partial
> > > page is correctly flushed.
> >
> > Since GetXLogInsertRecPtr() returns a bogus LSN and XLogFlush() does
> > almost nothing during recovery, I added a !RecoveryInProgress() check
> > as follows. I've attached the latest version of the patch and updated
> > the commit message.
> >
> > -    if (got_STOPPING)
> > -        XLogBackgroundFlush();
> > +    if (got_STOPPING && !RecoveryInProgress())
> > +        XLogFlush(GetXLogInsertRecPtr());
>
> I've pushed the patch. Thanks!

I'm pretty sure this is not correct as-is; it suffers from the same issue as
https://postgr.es/m/vf4hbwrotvhbgcnknrqmfbqlu75oyjkmausvy66ic7x7vuhafx%40e4rvwavtjswo
I.e., it is not safe to use GetXLogInsertRecPtr() to determine how far to
flush, due to page boundaries.
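
To illustrate the page-boundary concern, here is a minimal sketch. It is a
simplified model and not PostgreSQL's actual code: the page size and header
size are assumed values, every page is treated as having the same header
(the larger header at the start of a segment is ignored), and all function
names are hypothetical. It shows how "where the next record would start" and
"where the inserted data ends" can differ when the insert position sits
exactly on a page boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; real ones depend on the build. */
#define PAGE_SIZE        8192u  /* stands in for the WAL block size */
#define PAGE_HEADER      24u    /* stands in for a per-page header */
#define USABLE_PER_PAGE  (PAGE_SIZE - PAGE_HEADER)

/*
 * Hypothetical helper: map a "usable byte position" (header bytes not
 * counted) to the location where the NEXT record would start.  At a page
 * boundary this points past the header of the following page.
 */
static uint64_t
bytepos_to_start_ptr(uint64_t bytepos)
{
    uint64_t full_pages = bytepos / USABLE_PER_PAGE;
    uint64_t offset = bytepos % USABLE_PER_PAGE;

    return full_pages * PAGE_SIZE + PAGE_HEADER + offset;
}

/*
 * Hypothetical helper: map the same position to the location where the
 * already-inserted data ENDS.  At a page boundary this stays at the start
 * of the following page, before its header.
 */
static uint64_t
bytepos_to_end_ptr(uint64_t bytepos)
{
    uint64_t full_pages = bytepos / USABLE_PER_PAGE;
    uint64_t offset = bytepos % USABLE_PER_PAGE;

    if (offset == 0)
        return full_pages * PAGE_SIZE;
    return full_pages * PAGE_SIZE + PAGE_HEADER + offset;
}

/*
 * Number of bytes by which a flush request based on the "start" pointer
 * would exceed the data that has actually been inserted.
 */
static uint64_t
flush_overshoot(uint64_t bytepos)
{
    uint64_t start = bytepos_to_start_ptr(bytepos);
    uint64_t end = bytepos_to_end_ptr(bytepos);

    assert(start >= end);
    return start - end;
}
```

In this model the two pointers agree in the middle of a page, but when the
insert position lands exactly on a page boundary the "start" pointer is one
page header ahead of the inserted data, so a flush request based on it asks
for bytes that have not been inserted.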

Greetings,

Andres Freund
