From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work |
Date: | 2022-01-15 08:34:12 |
Message-ID: | CALj2ACV+acrnWUdwSNUXNzXLNA+kFkfuT8t=wiMoPhBWKrWUeA@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Jan 14, 2022 at 1:08 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-12-31 18:12:37 +0530, Bharath Rupireddy wrote:
> > Currently the server is erroring out when unable to remove/parse a
> > logical rewrite file in CheckPointLogicalRewriteHeap wasting the
> > amount of work the checkpoint has done and preventing the checkpoint
> > from finishing.
>
> This seems like it'd make failures to remove the files practically
> invisible. Which'd have it's own set of problems?
>
> What motivated proposing this change?
We had an issue where many mapping files were generated during crash
recovery, and the end-of-recovery checkpoint was taking a long time.
We had to intervene manually and delete some of the mapping files
(although that may not sound sensible) to make the end-of-recovery
checkpoint faster. Because of a race between the manual deletion and
the checkpoint's own deletion, unlink failed with an error, which
crashed the server. The server then entered recovery again, wasting
all of the earlier recovery work.
In summary, with the changes proposed for CheckPointLogicalRewriteHeap
in this thread (emitting LOG-only messages for unlink failures and
continuing with the other files), together with the existing code in
CheckPointSnapBuild, I'm sure it will help avoid wasting the recovery
work that has already been done if unlink fails for any reason.
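To illustrate the "log and continue" behaviour described above, here is a
minimal standalone sketch. It is not the patch and not PostgreSQL code:
the function name remove_files_tolerant is hypothetical, and it writes to
stderr where the server would emit a LOG-level message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only (not PostgreSQL code).
 *
 * Try to unlink every path. On failure, print a LOG-style line to
 * stderr and move on to the next file instead of aborting the pass.
 * Returns the number of files that could not be removed.
 */
size_t
remove_files_tolerant(const char *const *paths, size_t npaths)
{
    size_t      failures = 0;

    for (size_t i = 0; i < npaths; i++)
    {
        if (unlink(paths[i]) != 0)
        {
            /* Report the failure, but keep processing the other files. */
            fprintf(stderr, "LOG: could not remove file \"%s\": %s\n",
                    paths[i], strerror(errno));
            failures++;
        }
    }

    return failures;
}
```

A file that was already deleted by hand shows up here as one logged
failure (ENOENT) and the loop still removes the remaining files, which is
the case that hit us.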
Regards,
Bharath Rupireddy.