Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_receivewal fail to streams when the partial file to write is not fully initialized present in the wal receiver directory
Date: 2022-04-12 00:03:51
Message-ID: YlTB5yiuzrpdC+X9@paquier.xyz
Lists: pgsql-hackers

On Mon, Apr 11, 2022 at 01:21:23PM -0700, SATYANARAYANA NARLAPURAM wrote:
> Correct. The idea is to make sure the file is fully allocated before
> treating it as a current file.

Another problem comes with compression: pre-padding cannot be applied
in this case because zlib and lz4 do not know the size of the
compressed segment until the full 16MB of data has been received.
You can still get a good estimate as long as you know how much space
is left on the device.  FWIW, I had to deal with this problem a couple
of years ago for the integration of an archiver in a certain thing,
and the requirement was that the WAL archiver service had to be as
self-aware and automated as possible, which is what you wish to
achieve here.  It basically came down to measuring how much WAL one
wishes to keep in the WAL archives for the sizing of the disk
partition storing the archives (aka how far back in time you want to
go), in combination with how much WAL would get produced on a
rather-linear production load.
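
To make the sizing arithmetic concrete, here is a minimal sketch,
assuming a steady WAL generation rate and the default 16MB segment
size.  The function names and the example figures (2GiB of WAL per
hour, 72 hours of retention) are hypothetical and only serve as an
illustration; they do not come from the archiver described above.

```python
import math

# Default WAL segment size (16MB); an assumption, as it can be changed at initdb.
WAL_SEG_SZ = 16 * 1024 * 1024


def segments_to_keep(wal_bytes_per_hour, retention_hours, wal_seg_sz=WAL_SEG_SZ):
    """Number of whole segments produced over the retention window."""
    return math.ceil(wal_bytes_per_hour * retention_hours / wal_seg_sz)


def archive_bytes_needed(wal_bytes_per_hour, retention_hours, wal_seg_sz=WAL_SEG_SZ):
    """Space taken by those segments if they are stored uncompressed."""
    return segments_to_keep(wal_bytes_per_hour, retention_hours, wal_seg_sz) * wal_seg_sz


if __name__ == "__main__":
    rate = 2 * 1024**3  # hypothetical load: 2GiB of WAL per hour
    print(segments_to_keep(rate, 72), "segments")
    print(archive_bytes_needed(rate, 72) // 1024**3, "GiB")
```

This gives the space used by uncompressed segments only; compressed
archives would need less, by an amount that depends on the data.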

Another thing is that you never really want to stress your partition
so much that it gets filled to 100%, as there could be open files and
the like that consume more space than the actual amount of data
stored; you'd usually want to keep it filled up to 70~90%.  At the
end, we finished with:
- A dependency on statvfs(), which is not portable to WIN32, to find
out how much space was left on the partition (f_blocks*f_bsize for
the total size and f_bfree*f_bsize for the free size I guess, by
looking at its man page).
- Control of the amount of WAL to keep around using a percentage rate
of maximum disk space allowed (or just a percentage of free disk
space), with pg_receivewal doing a cleanup of up to WalSegSz worth of
data for the oldest segments.  The segments of the oldest TLIs are
removed first.  With any compression algorithm, unlinking this much
data is not necessary, but that's fine as you usually just remove one
compressed or uncompressed segment per cycle, and it does not matter
with dozens of gigs worth of WAL archives, or even more.
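
The two points above could be sketched roughly as follows.  This is
not pg_receivewal code and not the implementation referred to here,
just an illustration in Python, whose os.statvfs() exposes the same
fields as the C call (and is likewise unavailable on Windows).  The
function names, the 80% threshold and the "at least one segment per
cycle" rule are assumptions.  One detail differs from the guess
above: POSIX defines f_blocks and f_bfree in units of f_frsize, so
the sketch multiplies by that field rather than f_bsize.

```python
import os
import re

# Default WAL segment size (16MB); an assumption, as it can be changed at initdb.
WAL_SEG_SZ = 16 * 1024 * 1024

# Completed segments: 24 hex digits (8 for the TLI, 16 for the segment),
# optionally followed by a compression suffix.  ".partial" files do not match.
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}(\.gz|\.lz4)?$")


def used_fraction(path):
    """Fraction of the partition holding 'path' that is in use (0.0 to 1.0)."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    free = st.f_bfree * st.f_frsize
    return (total - free) / total if total else 0.0


def pick_segments_to_remove(names_and_sizes, budget=WAL_SEG_SZ):
    """Pick the oldest segments, staying within 'budget' bytes.

    Names are fixed-width hex, so a plain sort orders them by TLI first,
    then by segment number.  At least one segment is always picked.
    """
    candidates = sorted((n, s) for n, s in names_and_sizes if SEGMENT_RE.match(n))
    picked, freed = [], 0
    for name, size in candidates:
        if picked and freed + size > budget:
            break
        picked.append(name)
        freed += size
    return picked


def cleanup_cycle(archive_dir, max_used=0.8, budget=WAL_SEG_SZ):
    """Run one cleanup cycle; return the names of the removed segments."""
    if used_fraction(archive_dir) <= max_used:
        return []
    entries = [(e.name, e.stat().st_size)
               for e in os.scandir(archive_dir) if e.is_file()]
    removed = pick_segments_to_remove(entries, budget)
    for name in removed:
        os.unlink(os.path.join(archive_dir, name))
    return removed
```

Calling cleanup_cycle() once per received segment would remove a
single uncompressed segment per cycle, or a few compressed ones,
which keeps the partition near the chosen threshold.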
--
Michael
