Re: avoid multiple hard links to same WAL file after a crash

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: avoid multiple hard links to same WAL file after a crash
Date: 2022-04-27 07:09:20
Message-ID: YmjsIGw3SS2zb5Yd@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 26, 2022 at 01:09:35PM -0700, Nathan Bossart wrote:
> Here is an attempt at creating something that can be back-patched. 0001
> simply replaces calls to durable_rename_excl() with durable_rename() and is
> intended to be back-patched. 0002 removes the definition of
> durable_rename_excl() and is _not_ intended for back-patching. I imagine
> 0002 will need to be held back for v16devel.

I would not mind applying 0002 on HEAD now to avoid more uses of this
API, and I can get behind 0001 after thinking more about it.

> I think back-patching 0001 will encounter a couple of small obstacles. For
> example, the call in basic_archive won't exist on most of the
> back-branches, and durable_rename_excl() was named durable_link_or_rename()
> before v13. I don't mind producing a patch for each back-branch if needed.

I am not sure that have any need to backpatch this change based on the
unlikeliness of the problem, TBH. One thing that is itching me a bit,
like Robert upthread, is that we don't check anymore that the newfile
does not exist in the code paths because we never expect one. It is
possible to use stat() for that. But access() within a simple
assertion would be simpler? Say something like:
Assert(access(path, F_OK) != 0 && errno == ENOENT);

The case for basic_archive is limited as the comment of the patch
states, but that would be helpful for the two calls in timeline.c and
the one in xlog.c in the long-term. And this has no need to be part
of fd.c, this can be added before the durable_rename() calls. What do
you think?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-04-27 07:17:35 Re: Fix NULL pointer reference in _outPathTarget()
Previous Message Michael Paquier 2022-04-27 06:12:35 Re: bogus: logical replication rows/cols combinations