From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: Cleaning up orphaned files using undo logs |
Date: | 2021-09-25 06:46:40 |
Message-ID: | CAA4eK1+UtutcnUY4LgfS_ndA81tEDr5F67WVFixLYejGObW0Og@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Sep 24, 2021 at 4:44 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > On Mon, Sep 20, 2021 at 10:24 AM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > >
> > > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > > On Fri, Sep 17, 2021 at 9:50 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> > > > >
> > > > > > On Tue, Sep 14, 2021 at 10:51:42AM +0200, Antonin Houska wrote:
> > > > >
> > > > > * What happened with the idea of abandoning discard worker for the sake
> > > > > of simplicity? From what I see limiting everything to foreground undo
> > > > > could reduce the core of the patch series to the first four patches
> > > > > (forgetting about test and docs, but I guess it would be enough at
> > > > > least for the design review), which is already less overwhelming.
>
> > > What we can miss, at least for the cleanup of the orphaned files, is the *undo
> > > worker*. In this patch series the cleanup is handled by the startup process.
> > >
> >
> > Okay, I think various people at different point of times has suggested
> > that idea. I think one thing we might need to consider is what to do
> > in case of a FATAL error? In case of FATAL error, it won't be
> > advisable to execute undo immediately, so would we upgrade the error
> > to PANIC in such cases. I remember vaguely that for clean up of
> > orphaned files that can happen rarely and someone has suggested
> > upgrading the error to PANIC in such a case but I don't remember the
> > exact details.
>
> Do you mean FATAL error during normal operation?
>
Yes.
> As far as I understand, even
> zheap does not rely on immediate UNDO execution (otherwise it'd never
> introduce the undo worker), so FATAL only means that the undo needs to be
> applied later so it can be discarded.
>
Yeah, zheap either applies undo later via background worker or next
time before dml operation if there is a need.
> As for the orphaned files cleanup feature with no undo worker, we might need
> PANIC to ensure that the undo is applied during restart and that it can be
> discarded, otherwise the unapplied undo log would stay there until the next
> (regular) restart and it would block discarding. However upgrading FATAL to
> PANIC just because the current transaction created a table seems quite
> rude.
>
True, I guess but we can once see in what all scenarios it can
generate FATAL during that operation.
> So the undo worker might be needed even for this patch?
>
I think we can keep undo worker as a separate patch and for base patch
keep the idea of promoting FATAL to PANIC. This will at the very least
make the review easier.
> Or do you mean FATAL error when executing the UNDO?
>
No.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2021-09-25 07:23:11 | Re: Skipping logical replication transactions on subscriber side |
Previous Message | Amit Kapila | 2021-09-25 05:29:46 | Re: row filtering for logical replication |