Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Date: 2022-03-30 16:21:30
Message-ID: 20220330162008.GA784267@nathanxps13
Lists: pgsql-hackers

On Tue, Mar 29, 2022 at 03:48:32PM -0700, Nathan Bossart wrote:
> On Thu, Mar 24, 2022 at 01:17:01PM +1300, Thomas Munro wrote:
>>      /* we're only handling directories here, skip if it's not ours */
>> -    if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
>> +    if (lstat(path, &statbuf) != 0)
>> +        ereport(ERROR,
>> +                (errcode_for_file_access(),
>> +                 errmsg("could not stat file \"%s\": %m", path)));
>> +    else if (!S_ISDIR(statbuf.st_mode))
>>          return;
>>
>> Why is this a good place to silently ignore non-directories?
>> StartupReorderBuffer() is already in charge of skipping random
>> detritus found in the directory, so would it be better to do "if
>> (get_dirent_type(...) != PGFILETYPE_DIR) continue" there, and then
>> drop the lstat() stanza from ReorderBufferCleanupSerializedTXNs()
>> completely? Then perhaps its ReadDirExtended() should be using ERROR
>> instead of INFO, so that missing/non-dir/b0rked directories raise an
>> error.
>
> My guess is that this was done because ReorderBufferCleanupSerializedTXNs()
> is also called from ReorderBufferAllocate() and ReorderBufferFree().
> However, it is odd that we just silently return if the slot path isn't a
> directory in those cases. I think we could use get_dirent_type() in
> StartupReorderBuffer() as you suggested, and then we could let ReadDir()
> ERROR for non-directories for the other callers of
> ReorderBufferCleanupSerializedTXNs(). WDYT?
>
>> I don't understand why it's reporting readdir() errors at INFO
>> but unlink() errors at ERROR, and as far as I can see the other paths
>> that reach this code shouldn't be sending in paths to non-directories
>> here unless something is seriously busted and that's ERROR-worthy.
>
> I agree. I'll switch it to ReadDir() in the next revision so that we ERROR
> instead of INFO.

Here is an updated patch set.
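
For readers without the attachments handy, the shape of the change discussed
above can be sketched roughly as follows. This is a standalone POSIX
illustration, not code from the patches: classify_entry(),
count_subdirectories() and entry_type are hypothetical names standing in for
get_dirent_type() and the scan loop in StartupReorderBuffer(), and failures
are returned as -1 where the backend would ereport(ERROR). The point is only
that the scanning caller skips non-directories, while an unreadable or
non-directory path is treated as an error rather than silently ignored.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for PostgreSQL's get_dirent_type(); not the real API. */
typedef enum
{
    ENTRY_ERROR,                /* lstat() failed */
    ENTRY_DIR,                  /* a directory */
    ENTRY_OTHER                 /* regular file, symlink, etc. */
} entry_type;

static entry_type
classify_entry(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return ENTRY_ERROR;
    return S_ISDIR(st.st_mode) ? ENTRY_DIR : ENTRY_OTHER;
}

/*
 * Scan dir_path and count its subdirectories, skipping anything that is
 * not a directory.  Returns -1 if dir_path cannot be opened (missing, or
 * not a directory) or if an entry cannot be examined; the backend would
 * raise an ERROR at those points instead.
 */
static int
count_subdirectories(const char *dir_path)
{
    DIR        *dir;
    struct dirent *de;
    int         ndirs = 0;

    assert(dir_path != NULL);

    dir = opendir(dir_path);
    if (dir == NULL)
    {
        fprintf(stderr, "could not open directory \"%s\": %s\n",
                dir_path, strerror(errno));
        return -1;
    }

    while ((de = readdir(dir)) != NULL)
    {
        char        path[4096];

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", dir_path, de->d_name);

        switch (classify_entry(path))
        {
            case ENTRY_ERROR:
                fprintf(stderr, "could not stat file \"%s\": %s\n",
                        path, strerror(errno));
                closedir(dir);
                return -1;
            case ENTRY_OTHER:
                continue;       /* random detritus: skip in the caller */
            case ENTRY_DIR:
                ndirs++;        /* the per-slot cleanup would run here */
                break;
        }
    }

    closedir(dir);
    return ndirs;
}
```

With this split, the per-directory cleanup routine no longer needs its own
lstat() check: anything it is handed that is not a directory fails at
opendir() and is reported as an error.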

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment                                              Content-Type  Size
v11-0001-make-more-use-of-get_dirent_type.patch         text/x-diff   12.4 KB
v11-0002-minor-improvements-to-replication-code.patch   text/x-diff   2.4 KB
