Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory
Date: 2022-04-18 09:20:17
Message-ID: CALj2ACXDFi+cvyB6oODe1P9ACc5pMxjtWPTaE-9-7SKezF7QTQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 12, 2022 at 5:34 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Apr 11, 2022 at 01:21:23PM -0700, SATYANARAYANA NARLAPURAM wrote:
> > Correct. The idea is to make sure the file is fully allocated before
> > treating it as a current file.
>
> Another problem comes to compression, as the pre-padding cannot be
> applied in this case because zlib and lz4 don't know the size of the
> compressed segment until we reach 16MB of data received, but you can
> get a good estimate as long as you know how much space is left on a
> device. FWIW, I had to deal with this problem a couple of years ago
> for the integration of an archiver in a certain thing, and the
> requirement was that the WAL archiver service had to be a maximum
> self-aware and automated, which is what you wish to achieve here. It
> basically came down to measure how much WAL one wishes to keep in the
> WAL archives for the sizing of the disk partition storing the
> archives (aka how much back in time you want to go), in combination to
> how much WAL would get produced on a rather-linear production load.
>
> Another thing is that you never really want to stress too much your
> partition so as it gets filled at 100%, as there could be opened files
> and the kind that consume more space than the actual amount of data
> stored, but you'd usually want to keep up to 70~90% of it. At the
> end, we finished with:
> - A dependency to statvfs(), which is not portable on WIN32, to find
> out how much space was left on the partition (f_blocks*f_bsize for
> the total size and f_bfree*f_bsize for the free size I guess, by
> looking at its man page).
> - Control the amount of WAL to keep around using a percentage rate of
> maximum disk space allowed (or just a percentage of free disk space),
> with pg_receivewal doing a cleanup of up to WalSegSz worth of data for
> the oldest segments. The segments of the oldest TLIs are removed
> first. For any compression algorithm, unlinking this much amount of
> data is not necessary but that's fine as you usually just remove one
> compressed or uncompressed segment per cycle, as it does not matter
> with dozens of gigs worth of WAL archives, or even more.
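
As a rough illustration of the two points quoted above, here is a minimal C sketch of a statvfs()-based space check combined with a percentage threshold. It is not code from pg_receivewal or from any patch in this thread; the function names get_partition_usage and archive_needs_cleanup and the max_used_pct parameter are hypothetical. One assumption differs from the quoted text: POSIX specifies f_blocks and f_bfree in units of f_frsize, so the sketch multiplies by f_frsize rather than f_bsize (the two are equal on many filesystems).

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: report the total and free size, in bytes, of the
 * partition that holds "path".  Returns 0 on success, -1 on failure
 * (errno is set by statvfs()).
 */
int
get_partition_usage(const char *path, uint64_t *total_bytes,
					uint64_t *free_bytes)
{
	struct statvfs st;

	if (statvfs(path, &st) != 0)
		return -1;

	/* POSIX counts f_blocks and f_bfree in units of f_frsize. */
	*total_bytes = (uint64_t) st.f_blocks * st.f_frsize;
	*free_bytes = (uint64_t) st.f_bfree * st.f_frsize;
	return 0;
}

/*
 * Hypothetical policy check: true when the used space exceeds
 * max_used_pct percent of the partition, i.e. when the oldest
 * segments should be removed.
 */
bool
archive_needs_cleanup(uint64_t total_bytes, uint64_t free_bytes,
					  unsigned max_used_pct)
{
	uint64_t	used_bytes;

	if (total_bytes == 0 || free_bytes > total_bytes)
		return false;			/* nothing sensible to compare */

	used_bytes = total_bytes - free_bytes;

	/*
	 * Divide before multiplying to avoid 64-bit overflow; the rounding
	 * error is below 100 bytes for percentages up to 100.
	 */
	return used_bytes > (total_bytes / 100) * max_used_pct;
}
```

A caller would run get_partition_usage() on the archive directory once per cycle and, while archive_needs_cleanup() returns true, unlink the oldest segment. Using f_bavail instead of f_bfree would give the space available to unprivileged processes, which may be the more useful number for a service that does not run as root.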

Thanks for sharing this. Will the write operations (in
dir_open_for_write) for PG_COMPRESSION_GZIP and PG_COMPRESSION_LZ4
take longer compared to prepadding for non-compressed files?

I would like to know if there's any problem with the proposed fix.

I think we need the same fix proposed in this thread for
tar_open_for_write as well because it also does prepadding for
non-compressed files.

In general, I agree that making pg_receivewal self-aware and able to
automate things by itself is a really great idea, as it avoids manual
effort. For instance, pg_receivewal could try different streaming
start LSNs (the restart_lsn of its slot or the server's insert LSN),
not just the latest LSN found in its target directory. That would be
particularly helpful when its source server has changed timeline or
is, for some reason, unable to serve the WAL.

Regards,
Bharath Rupireddy.
