Re: POC: Cleaning up orphaned files using undo logs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2020-11-26 03:02:58
Message-ID: CAA4eK1+XR05PxX+e_QrJusQcTMX_orUWNcRthX=UT5DDqN4YNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 25, 2020 at 8:00 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > On Wed, Nov 18, 2020 at 4:03 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > >
> > > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > > On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > > >
> > > > > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > > > > >
> > > > > > >
> > > > > > > No background undo
> > > > > > > ------------------
> > > > > > >
> > > > > > > Reduced complexity of the patch seems to be the priority at the moment. Amit
> > > > > > > suggested that cleanup of an orphaned relation file is simple enough to be
> > > > > > > done on foreground and I agree.
> > > > > > >
> > > > > >
> > > > > > Yeah, I think we should try and see if we can make it work but I
> > > > > > noticed that there are few places like AbortOutOfAnyTransaction where
> > > > > > we have the assumption that undo will be executed in the background.
> > > > > > We need to deal with it.
> > > > >
> > > > > I think this is o.k. if we always check for unapplied undo during startup.
> > > > >
> > > >
> > > > Hmm, how it is ok to leave undo (and rely on startup) unless it is a
> > > > PANIC error. IIRC, this path is invoked in non-panic errors as well.
> > > > Basically, we won't be able to discard such an undo which doesn't seem
> > > > like a good idea.
> > >
> > > Since failure to apply leaves unconsistent data, I assume it should always
> > > cause PANIC, shouldn't it?
> > >
> >
> > But how can we ensure that AbortOutOfAnyTransaction will be called
> > only in that scenario?
>
> I meant that AbortOutOfAnyTransaction should PANIC itself if it sees that
> there is unapplied undo, so nothing changes for its callers. Do I still miss
> something?
>

Adding PANIC in some generic code-path sounds scary. Why can't we
simply try to execute undo?

> > > > Another thing is that it seems we need to connect to the database to perform
> > > > it which might appear a bit odd that we don't allow users to connect to the
> > > > database but internally we are connecting it.
> > >
> > > I think the implementation will need to follow the outcome of the part of the
> > > discussion that starts at [2], but I see your concern. I'm thinking why
> > > database connection is not needed to apply WAL but is needed for UNDO. I think
> > > locks make the difference.
> > >
> >
> > Yeah, it would be probably a good idea to see if we can make undo
> > apply work without db-connection especially if we want to do before
> > allowing connections. The other possibility could be to let discard
> > worker do this work lazily after allowing connections.
>
> Actually I hit the problem of missing connection when playing with the
> "undoxacttest" module. Those tests use table_open() / table_close() functions,
> but it might not be necessary for the real RMGRs.
>

How can we apply the action on a page without opening the relation?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message k.jamison@fujitsu.com 2020-11-26 03:04:10 RE: [Patch] Optimize dropping of relation buffers using dlist
Previous Message Ajin Cherian 2020-11-26 02:55:18 Re: [HACKERS] logical decoding of two-phase transactions