Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory
Date: 2022-04-25 11:47:41
Message-ID: CALj2ACV=+KP6jpL_NpTzRnTRahL_DmKdNftrezyTwQEZxnN_BA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 25, 2022 at 6:38 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Fri, Apr 22, 2022 at 07:17:37PM +0530, Bharath Rupireddy wrote:
> > Right. We find enough disk space and go to write and suddenly the
> > write operations fail for some reason or the VM crashes because of a
> > reason other than disk space. I think the foolproof solution is to
> > figure out the available disk space before prepadding or compressing
> > and also use the
> > write-first-to-temp-file-and-then-rename-it-to-original-file as
> > proposed in the earlier patches in this thread.
>
> Yes, what would count here is only the amount of free space in a
> partition. The total amount of space available becomes handy once you
> begin introducing things like percentage-based quota policies for the
> disk when archiving. The free amount of space could be used to define
> a policy based on the maximum number of bytes you need to leave
> around, as well, but this is not perfect science as this depends of
> what FSes decide to do underneath. There are a couple of designs
> possible here. When I had to deal with my upthread case I have chosen
> one as I had no need to worry only about Linux, it does not mean that
> this is the best choice that would fit with the long-term community
> picture. This comes down to how much pg_receivewal should handle
> automatically, and how it should handle it.

Thanks. I'm not sure why we are thinking only of crashes due to
running out of disk space. Figuring out the free disk space before
writing a huge file (say, a WAL file) is a problem in itself for core
postgres as well, not just for pg_receivewal.

I think we are a bit off-track here. Let me illustrate what the
whole problem is, and the idea:

If the node/VM on which pg_receivewal runs goes down, crashes, or fails
during a write operation while padding the target WAL file (the .partial
file) with zeros, the unfilled target WAL file (let me call this file
a partially padded .partial file) will be left over, and subsequent
reads/writes to it will fail with the "write-ahead log file \"%s\"
has %zd bytes, should be 0 or %d" error, which requires manual
intervention to remove the file. In a service, this manual intervention
is what we would like to avoid. Let's not bother much with compressed
file writes (for now at least), as they don't have a prepadding phase.
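
To make the failure mode concrete, here is a rough standalone sketch of
the kind of size check that produces the error above. It is not the
actual code from the tree: classify_partial() and the PartialState
values are hypothetical names made up for this illustration, and the
segment size is simply passed in by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch; not PostgreSQL identifiers. */
typedef enum
{
    PARTIAL_MISSING,            /* no file yet: create and prepad it */
    PARTIAL_USABLE,             /* size is 0 or exactly one segment */
    PARTIAL_BROKEN              /* any other size: prepadding was interrupted */
} PartialState;

/*
 * Classify an existing .partial file by its size alone.
 * For simplicity, any stat() failure is treated as "file missing".
 */
static PartialState
classify_partial(const char *path, off_t seg_size)
{
    struct stat st;

    assert(seg_size > 0);

    if (stat(path, &st) != 0)
        return PARTIAL_MISSING;

    if (st.st_size == 0 || st.st_size == seg_size)
        return PARTIAL_USABLE;

    /* A partially padded file lands here and needs manual cleanup today. */
    fprintf(stderr,
            "write-ahead log file \"%s\" has %lld bytes, should be 0 or %lld\n",
            path, (long long) st.st_size, (long long) seg_size);
    return PARTIAL_BROKEN;
}
```

A crash in the middle of prepadding leaves a file whose size is neither
0 nor the segment size, so it falls into the PARTIAL_BROKEN case on the
next start.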

The proposed solution is to make the prepadding atomic: prepad the
file under the name XXXX.partial.tmp and, once the prepadding is
complete, rename it (durably if the sync option is chosen for
pg_receivewal) to XXXX.partial. Before prepadding, delete
XXXX.partial.tmp if it already exists.
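
A minimal standalone sketch of that sequence, using plain POSIX calls
rather than the walmethods code: prepad_atomically(), PAD_BLOCK and the
do_sync flag are names assumed for this illustration only. It also
omits the directory fsync that a fully durable rename would need, and
it does not retry interrupted writes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAD_BLOCK 8192          /* size of each zero-filled write */

/*
 * Prepad "<partial_path>.tmp" with seg_size zero bytes, then rename it to
 * partial_path.  Returns 0 on success, -1 on failure.  A crash before the
 * rename leaves only the .tmp file behind, which the next call removes.
 */
static int
prepad_atomically(const char *partial_path, off_t seg_size, int do_sync)
{
    char        tmp_path[1024];
    char        zeros[PAD_BLOCK];
    off_t       written = 0;
    int         fd;

    assert(seg_size > 0);

    if (snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", partial_path)
        >= (int) sizeof(tmp_path))
        return -1;              /* path too long for this sketch */

    /* Remove a leftover temp file from an earlier interrupted attempt. */
    (void) unlink(tmp_path);

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    memset(zeros, 0, sizeof(zeros));
    while (written < seg_size)
    {
        size_t      chunk = sizeof(zeros);
        ssize_t     rc;

        if ((off_t) chunk > seg_size - written)
            chunk = (size_t) (seg_size - written);

        rc = write(fd, zeros, chunk);
        if (rc <= 0)
        {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        written += rc;
    }

    /* Flush the padded contents before the file becomes visible. */
    if (do_sync && fsync(fd) != 0)
    {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0)
    {
        unlink(tmp_path);
        return -1;
    }

    /* rename() replaces the target atomically on POSIX file systems. */
    if (rename(tmp_path, partial_path) != 0)
    {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

With this ordering, XXXX.partial either does not exist or is fully
padded, so the size check on restart never sees a partially padded
file.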

The above problem isn't unique to pg_receivewal; pg_basebackup too
uses CreateWalDirectoryMethod and dir_open_for_write via
ReceiveXlogStream.

IMHO, pg_receivewal checking for available disk space before writing
any file would be better discussed separately?

Regards,
Bharath Rupireddy.
