From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work |
Date: | 2022-08-09 03:59:16 |
Message-ID: | CALj2ACX3u2hvp=jiOZ20m0dKXxieNy=uhYfP6d1WTZh0pdso0A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Aug 9, 2022 at 9:20 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > On Tue, Aug 9, 2022 at 3:27 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I have not tried to analyze the error-handling properties of 0001,
> >> but if it's being equally cavalier then it shouldn't be committed
> >> either.
>
> > 0001 does introduce new errors, as mentioned in the commit message, in
> > the form of elevel ERROR passed into get_dirent_type(), which might be
> > thrown if your OS has no d_type and lstat() fails (also if you asked
> > to follow symlinks, but in those cases errors were already thrown).
> > But in those cases, it seems at least a little fishy that we ignored
> > errors from the existing lstat().
>
> Hmmm ... I'll grant that ignoring lstat errors altogether isn't great.
> But should the replacement behavior be elog-LOG-and-press-on,
> or elog-ERROR-and-fail-the-surrounding-operation? I'm not in any
> hurry to believe that the latter is more appropriate without some
> analysis of what the callers are doing.
>
> The bottom line here is that I'm distrustful of behavioral changes
> introduced to simplify refactoring rather than to solve a live
> problem.
+1. I agree with Tom not to change elog-LOG to elog-ERROR and fail the
checkpoint operation. Because the checkpoint is more important than
why a single snapshot file (out thousands or even million files) isn't
removed at that moment. Also, I originally proposed to change
elog-ERROR to elog-LOG in CheckPointLogicalRewriteHeap for unlink()
failures for the same reason.
--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2022-08-09 04:07:54 | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work |
Previous Message | Nathan Bossart | 2022-08-09 03:51:29 | Re: optimize lookups in snapshot [sub]xip arrays |