Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Date: 2022-01-27 01:01:09
Message-ID: 20220127010109.GA352793@nathanxps13
Lists: pgsql-hackers

On Fri, Jan 21, 2022 at 11:49:56AM -0800, Andres Freund wrote:
> On 2022-01-20 20:41:16 +0000, Bossart, Nathan wrote:
>> Here's this part.
>
> And pushed to all branches. Thanks.

Thanks!

I spent some time thinking about the right way to proceed here, and I came
up with the attached patches. The first patch just adds error checking for
various lstat() calls in the replication code. If lstat() fails, then it
probably doesn't make sense to try to continue processing the file.
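
For illustration only, here is a minimal standalone sketch of the pattern the
first patch is after: check lstat()'s return value before looking at the stat
buffer. The helper name is_regular_file() is hypothetical and not taken from
the patches, and fprintf() merely stands in for whatever the backend would
report through ereport().

```c
#define _POSIX_C_SOURCE 200809L	/* for lstat() under strict -std modes */

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the patches).
 *
 * Returns 1 if "path" is a regular file, 0 if it exists but is something
 * else, and -1 if lstat() itself failed.  The point is that statbuf is
 * only inspected after lstat() has succeeded.
 */
static int
is_regular_file(const char *path)
{
	struct stat statbuf;

	if (lstat(path, &statbuf) != 0)
	{
		/* Stand-in for an ereport() call in backend code. */
		fprintf(stderr, "could not stat file \"%s\": %s\n",
				path, strerror(errno));
		return -1;
	}

	return S_ISREG(statbuf.st_mode) ? 1 : 0;
}
```

Without the check, a failed lstat() leaves statbuf uninitialized, so a later
S_ISREG() test would be acting on garbage.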

The second patch changes some nearby ereport() calls to use ERROR. If these
failures are truly unexpected, and we don't intend to support use-cases
like concurrent manual deletion, then failing might be the right way to go.
I think it's a shame that such failures could cause checkpointing to
continually fail, but that topic is already being discussed elsewhere [0].

[0] https://postgr.es/m/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com/

Attachment                                                       Content-Type  Size
v4-0001-add-error-checking-for-calls-to-lstat-in-replicat.patch  text/x-diff   3.3 KB
v4-0002-minor-improvements-to-replication-code.patch             text/x-diff   2.3 KB
