Re: Requested WAL segment xxx has already been removed

From: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
To: Japin Li <japinli(at)hotmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Requested WAL segment xxx has already been removed
Date: 2025-07-15 09:24:35
Message-ID: CAFh8B=nH=41scx3F_EAh2H5O1w-nj8b7uCMp_0z4p4wsv2tFDA@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Mon, 14 Jul 2025 at 11:24, Japin Li <japinli(at)hotmail(dot)com> wrote:

> The configuration is as expected. My test script simulates two distinct
> hosts by utilizing local archive storage.
>
> For physical replication across distinct hosts without shared WAL archive
> storage, WALs are archived locally (in my test).
>
> When the primary's walsender needs a WAL file from the archive that's not
> in its pg_wal directory, manual copying is required to the primary's
> pg_wal or the standby's pg_wal (or its archive directory, and use
> restore_command to fetch it).
>
> What prevents us from using the primary's restore_command to retrieve the
> necessary WALs?
>

I am just talking about the practical side of local archive storage.
Such archives will be gone along with the server in case of disaster and
therefore they bring only a little value.
A physical standby can just as well use restore_command to copy files
from the archive on the primary via ssh/rsync or similar. This approach
has been used for ages and works just fine.
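
For illustration, a minimal sketch of the core of such a standby-side
restore_command helper. The host name, archive directory and function
names are made up for this example and do not come from the thread; it
also assumes that passwordless ssh from the standby to the primary is
already set up.

```python
import subprocess

# Assumptions for illustration only; neither value comes from the thread.
ARCHIVE_HOST = "primary.example.com"
ARCHIVE_DIR = "/var/lib/pgsql/wal_archive"


def build_rsync_command(wal_file, dest_path, host=ARCHIVE_HOST, archive_dir=ARCHIVE_DIR):
    """Return the rsync argv that copies one archived file from the primary over ssh."""
    # restore_command passes a bare file name (%f); reject anything path-like.
    if "/" in wal_file or wal_file in ("", ".", ".."):
        raise ValueError(f"unexpected WAL file name: {wal_file!r}")
    return ["rsync", "-a", "-e", "ssh", f"{host}:{archive_dir}/{wal_file}", dest_path]


def fetch_wal(wal_file, dest_path):
    """Run rsync and return its exit status.

    A non-zero status tells PostgreSQL that the file is not available
    from the archive.
    """
    return subprocess.run(build_rsync_command(wal_file, dest_path)).returncode
```

A thin wrapper script would pass its two arguments to fetch_wal() and exit
with the returned status; the standby would then be configured with
something like restore_command = 'python3 /path/to/wrapper.py %f %p'
(the wrapper path is hypothetical).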

What is really painful right now is that logical walsenders can only look
into pg_wal, and unfortunately replication slots don't give a 100%
guarantee of WAL retention because of max_slot_wal_keep_size.
That is, using restore_command for logical walsenders would be really
helpful and would solve some problems and pain points with logical
replication.
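
To make the retention gap concrete: since PostgreSQL 13 the
pg_replication_slots view reports a wal_status per slot. Below is a rough
sketch that flags slots whose required WAL is no longer guaranteed to be
in pg_wal. It works on rows that were already fetched; the function name
and the row layout are assumptions for this example.

```python
# Query whose result rows the function below expects (PostgreSQL 13+).
SLOT_QUERY = "SELECT slot_name, wal_status, safe_wal_size FROM pg_replication_slots"

# wal_status values at which required WAL is about to be removed
# ("unreserved") or has already been removed ("lost").
AT_RISK = {"unreserved", "lost"}


def slots_at_risk(rows):
    """Return the names of slots whose required WAL is no longer guaranteed."""
    return [name for name, wal_status, _safe_wal_size in rows if wal_status in AT_RISK]
```

Slots reported here are the ones for which a walsender limited to pg_wal
may hit "requested WAL segment ... has already been removed".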

However, if we also start calling restore_command for physical
walsenders, it might result in increased resource usage on the primary
without providing much additional value. For example, restore_command may
keep failing while the standby continues making replication connection
attempts indefinitely.

I don't mind if it also works for physical replication, but IMO there
should be a possibility to opt out of it.

Regards,
--
Alexander Kukushkin
