Re: Changing shared_buffers without restart

From: Andres Freund <andres(at)anarazel(dot)de>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Changing shared_buffers without restart
Date: 2025-09-18 14:05:01
Message-ID: y2sjrhyylmuc7h77cb5x2b3jhdhsws4stkxiumhde2tq7ewswh@ovfeglsvkihd
Lists: pgsql-hackers

Hi,

On 2025-09-18 09:52:03 -0400, Andres Freund wrote:
> On 2025-09-18 10:25:29 +0530, Ashutosh Bapat wrote:
> > From 0a55bc15dc3a724f03e674048109dac1f248c406 Mon Sep 17 00:00:00 2001
> > From: Dmitrii Dolgov <9erthalion6(at)gmail(dot)com>
> > Date: Fri, 4 Apr 2025 21:46:14 +0200
> > Subject: [PATCH 04/16] Introduce pss_barrierReceivedGeneration
> >
> > Currently WaitForProcSignalBarrier makes it possible to ensure that the
> > message sent via EmitProcSignalBarrier was processed by all ProcSignal
> > mechanism participants.
> >
> > Add pss_barrierReceivedGeneration alongside pss_barrierGeneration,
> > which will be updated when a process has received the message, but not
> > processed it yet. This makes it possible to support a new mode of
> > waiting, when ProcSignal participants want to synchronize message
> > processing. To do that, a participant can wait via
> > WaitForProcSignalBarrierReceived when processing a message, effectively
> > making sure that all processes are going to start processing
> > ProcSignalBarrier simultaneously.
>
> I doubt that "online resizing" which requires all processes to process the
> same event synchronously can really be called "online". There can be
> significant delays in processing a barrier; stalling the entire server until
> every process has reached it seems like a complete no-go for production systems.

> [...]

> > From 78bc0a49f8ebe17927abd66164764745ecc6d563 Mon Sep 17 00:00:00 2001
> > From: Dmitrii Dolgov <9erthalion6(at)gmail(dot)com>
> > Date: Tue, 17 Jun 2025 14:16:55 +0200
> > Subject: [PATCH 11/16] Allow to resize shared memory without restart
> >
> > Add an assign hook for shared_buffers to resize shared memory, using the
> > space introduced in the previous commits, without requiring a PostgreSQL restart.
> > Essentially the implementation is based on two mechanisms: a
> > ProcSignalBarrier is used to make sure all processes are starting the
> > resize procedure simultaneously, and a global Barrier is used to
> > coordinate after that and make sure all finished processes are waiting
> > for others that are in progress.
> >
> > The resize process looks like this:
> >
> > * The GUC assign hook sets a flag to let the Postmaster know that resize
> > was requested.
> >
> > * Postmaster verifies the flag in the event loop, and starts the resize
> > by emitting a ProcSignal barrier.
> >
> > * All processes that participate in the ProcSignal mechanism begin to
> > process the ProcSignal barrier. First, each process waits until all processes
> > have confirmed they received the message, so that they can start simultaneously.
>
> As mentioned above, this basically makes the entire feature not really
> online. Besides the latency of some processes not getting to the barrier
> immediately, there's also the issue that actually reserving large amounts of
> memory can take a long time - during which all processes would be unavailable.
>
> I really don't see that being viable. It'd be one thing if that were a
> "temporary" restriction, but the whole design seems to be fairly centered
> around that.
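
To put rough numbers on the stall concern above, here is a minimal Python sketch of the rendezvous described in the quoted commit message. It is a back-of-the-envelope model, not PostgreSQL code: the function name `stall_per_process`, the millisecond units, and the sample delays are all hypothetical and chosen only for illustration.

```python
def stall_per_process(arrival_delays_ms, reserve_ms):
    """Model how long each process is unavailable during a synchronized resize.

    arrival_delays_ms: hypothetical time each process needs before it notices
                       the barrier and stops doing useful work.
    reserve_ms:        hypothetical time spent reserving memory once every
                       process has arrived.
    """
    # Nobody may proceed before the slowest process has arrived, and
    # everybody stays blocked while the memory is being reserved.
    release_ms = max(arrival_delays_ms) + reserve_ms
    # A process is unavailable from its own arrival until the release.
    return [release_ms - d for d in arrival_delays_ms]


if __name__ == "__main__":
    print(stall_per_process([0, 100, 5000], 2000))
```

In this model the process that reacts immediately pays the highest price: it is blocked for the slowest process's delay plus the full reservation time.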

Besides not really being online, isn't this a recipe for endless undetected
deadlocks? What if process A waits for a lock held by process B and process B
arrives at the barrier? Process A won't ever get there, because process B
can't make progress, because A is not making progress.
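
The scenario can be reproduced in miniature with threads. The following is only a sketch under stated assumptions: a `threading.Lock` stands in for whatever lock process B holds, a `threading.Barrier` stands in for the "wait until everyone received the message" step, and the timeouts exist purely so the demo terminates. The names `simulate`, `process_a` and `process_b` are hypothetical; nothing here is taken from the patch.

```python
import threading


def simulate(timeout=0.4):
    """Two workers: B holds a lock and waits at a barrier; A needs that lock first."""
    lock = threading.Lock()          # stand-in for the lock held by process B
    barrier = threading.Barrier(2)   # stand-in for the synchronized start
    b_holds_lock = threading.Event()
    outcome = {}

    def process_b():
        with lock:
            b_holds_lock.set()
            try:
                # B arrives at the barrier while still holding the lock.
                barrier.wait(timeout=timeout)
                outcome["B"] = "passed barrier"
            except threading.BrokenBarrierError:
                outcome["B"] = "stuck at barrier"

    def process_a():
        b_holds_lock.wait()
        # A must get the lock before it can reach the barrier. The shorter
        # timeout only makes the hang observable in this demo.
        if not lock.acquire(timeout=timeout / 2):
            outcome["A"] = "stuck waiting for lock"
            return
        lock.release()
        try:
            barrier.wait(timeout=timeout)
            outcome["A"] = "passed barrier"
        except threading.BrokenBarrierError:
            outcome["A"] = "stuck at barrier"

    threads = [threading.Thread(target=process_b),
               threading.Thread(target=process_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


if __name__ == "__main__":
    print(simulate())
```

Without the timeouts neither thread would ever finish. The barrier wait does not appear as an edge in any lock wait graph, so a detector that only looks at lock waits would presumably not notice the cycle.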

Greetings,

Andres Freund
