Re: POC: Cleaning up orphaned files using undo logs

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2020-11-25 15:32:50
Message-ID: 12666.1606318370@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Nov 18, 2020 at 4:03 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > >
> > > > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > > > >
> > > > > >
> > > > > > No background undo
> > > > > > ------------------
> > > > > >
> > > > > > Reduced complexity of the patch seems to be the priority at the moment. Amit
> > > > > > suggested that cleanup of an orphaned relation file is simple enough to be
> > > > > > done on foreground and I agree.
> > > > > >
> > > > >
> > > > > Yeah, I think we should try and see if we can make it work but I
> > > > > noticed that there are few places like AbortOutOfAnyTransaction where
> > > > > we have the assumption that undo will be executed in the background.
> > > > > We need to deal with it.
> > > >
> > > > I think this is o.k. if we always check for unapplied undo during startup.
> > > >
> > >
> > > Hmm, how it is ok to leave undo (and rely on startup) unless it is a
> > > PANIC error. IIRC, this path is invoked in non-panic errors as well.
> > > Basically, we won't be able to discard such an undo which doesn't seem
> > > like a good idea.
> >
> > Since failure to apply leaves unconsistent data, I assume it should always
> > cause PANIC, shouldn't it?
> >
>
> But how can we ensure that AbortOutOfAnyTransaction will be called
> only in that scenario?

I meant that AbortOutOfAnyTransaction should PANIC itself if it sees that
there is unapplied undo, so nothing changes for its callers. Do I still miss
something?

> > > Another thing is that it seems we need to connect to the database to perform
> > > it which might appear a bit odd that we don't allow users to connect to the
> > > database but internally we are connecting it.
> >
> > I think the implementation will need to follow the outcome of the part of the
> > discussion that starts at [2], but I see your concern. I'm thinking why
> > database connection is not needed to apply WAL but is needed for UNDO. I think
> > locks make the difference.
> >
>
> Yeah, it would be probably a good idea to see if we can make undo
> apply work without db-connection especially if we want to do before
> allowing connections. The other possibility could be to let discard
> worker do this work lazily after allowing connections.

Actually I hit the problem of missing connection when playing with the
"undoxacttest" module. Those tests use table_open() / table_close() functions,
but it might not be necessary for the real RMGRs.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-11-25 15:33:09 Re: Prevent printing "next step instructions" in initdb and pg_upgrade
Previous Message Magnus Hagander 2020-11-25 15:25:56 Re: Prevent printing "next step instructions" in initdb and pg_upgrade