Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Date: 2022-08-10 08:41:04
Message-ID: CA+hUKGL30BKpudMLYP++hTvK+O5DirMkwEg6yWQ9BLycKfPUvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 10, 2022 at 7:15 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> The main idea of using get_dirent_type() instead of lstat or stat is
> to benefit from avoiding system calls on platforms where the directory
> entry type is stored in dirent structure, but not to cause errors for
> lstat or stat system calls failures where we previously didn't.

Yeah, I get it... I'm OK with separating mechanical changes like
lstat() -> get_dirent_type() from policy changes on error handling. I
initially thought the ERROR was surely better than what we're doing
now (making extra system calls to avoid unlinking unexpected things,
for which our only strategy on failure is to try to unlink the thing
anyway), but it's better to discuss that separately.

> I
> think 0001 must be just replacing lstat or stat with get_dirent_type()
> with elevel ERROR if lstat or stat failure is treated as ERROR
> previously, otherwise with elevel LOG. I modified it as attached. Once
> this patch gets committed, then we can continue the discussion as to
> whether we make elog-LOG to elog-ERROR or vice-versa.

Agreed, will look at this version tomorrow.

> I'm tempted to use get_dirent_type() in RemoveXlogFile() but that
> requires us to pass struct dirent * from RemoveOldXlogFiles() (as
> attached 0002 for now), If done, this avoids one lstat() system call
> per WAL file. IMO, that's okay to pass an extra function parameter to
> RemoveXlogFile() as it is a static function and there can be many
> (thousands of) WAL files at times, so saving system calls (lstat() *
> number of WAL files) is definitely an advantage.

- lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
+ get_dirent_type(path, xlde, false, LOG) != PGFILETYPE_REG &&

I think you wanted ==, but maybe it'd be nicer to pass in a
"recyclable" boolean to RemoveXlogFile()?

If you're looking for more, rmtree() looks like another.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2022-08-10 08:46:03 Re: Fix a typo in pgstatfuncs.c
Previous Message Alvaro Herrera 2022-08-10 08:40:35 Re: Using each rel as both outer and inner for JOIN_ANTI