Quick Links

Re: POC: Cleaning up orphaned files using undo logs

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Antonin Houska <ah(at)cybertec(dot)at>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: POC: Cleaning up orphaned files using undo logs
Date:	2020-11-15 05:56:01
Message-ID:	CAA4eK1KjE-c=00dA6qYqUSg=wd35rXK94T12y9fU-zc4VpPS-Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sun, Nov 15, 2020 at 11:24 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > > >
> > > >
> > > > No background undo
> > > > ------------------
> > > >
> > > > Reduced complexity of the patch seems to be the priority at the moment. Amit
> > > > suggested that cleanup of an orphaned relation file is simple enough to be
> > > > done on foreground and I agree.
> > > >
> > >
> > > Yeah, I think we should try and see if we can make it work but I
> > > noticed that there are few places like AbortOutOfAnyTransaction where
> > > we have the assumption that undo will be executed in the background.
> > > We need to deal with it.
> >
> > I think this is o.k. if we always check for unapplied undo during startup.
> >
>
> Hmm, how it is ok to leave undo (and rely on startup) unless it is a
> PANIC error. IIRC, this path is invoked in non-panic errors as well.
> Basically, we won't be able to discard such an undo which doesn't seem
> like a good idea.
>
> > > > "undo worker" is still there, but it only processes undo requests after server
> > > > restart because relation data can only be changed in a transaction - it seems
> > > > cleaner to launch a background worker for this than to hack the startup
> > > > process.
> > > >
> > >
> > > But, I see there are still multiple undoworkers that are getting
> > > launched and I am not sure if that works correctly because a
> > > particular undoworker is connected to a database and then it starts
> > > processing all the pending undo.
> >
> > Each undo worker applies only transactions for its own database, see
> > ProcessExistingUndoRequests():
> >
> > /* We can only process undo of the database we are connected to. */
> > if (xact_hdr.dboid != MyDatabaseId)
> > continue;
> >
> > Nevertheless, as I've just mentioned in my response to Thomas, I admit that we
> > should try to live w/o the undo worker altogether.
> >
>
> Okay, but keep in mind that there could be a large amount of undo
> (unlike redo which has some limit as we can replay it from the last
> checkpoint) which needs to be processed but it might be okay to live
> with that for now. Another thing is that it seems we need to connect
> to the database to perform it which might appear a bit odd that we
> don't allow users to connect to the database but internally we are
> connecting it. These are just some points to consider while finalizing
> the solution to this.
>
> > > > Discarding
> > > > ----------
> > > >
> > > > There's no discard worker for the URS infrastructure yet. I thought about
> > > > discarding the undo log during checkpoint, but checkpoint should probably do
> > > > more straightforward tasks than the calculation of a new discard pointer for
> > > > each undo log, so a background worker is needed. A few notes on that:
> > > >
> > > > * until the zheap AM gets added, only the transaction that creates the undo
> > > > records needs to access them. This assumption should make the discarding
> > > > algorithm a bit simpler. Note that with zheap, the other transactions need
> > > > to look for old versions of tuples, so the concept of oldestXidHavingUndo
> > > > variable is needed there.
> > > >
> > > > * it's rather simple to pass pointer the URS pointer to the discard worker
> > > > when transaction either committed or the undo has been executed.
> > > >
> > >
> > > Why can't we have a separate discard worker which keeps on scanning
> > > the undorecords and discard accordingly? Giving the onus of foreground
> > > process might be tricky because say discard worker is not up to speed
> > > and we ran out of space to pass such information for each commit/abort
> > > request.
> >
> > Sure, there should be a discard worker. The question is how to make its work
> > efficient. The initial run after restart probably needs to scan everything
> > between 'discard' and 'insert' pointers,
> >
>
> Yeah, such an initial scan would be helpful to identify pending aborts
> and allow them to be processed.
>
> > but then it should process only the
> > parts created by individual transactions.
> >
>
> Yeah, it needs to process transaction-by-transaction to see which all
> we can discard. Also, note that in Single-User mode we need to discard
> undo after commit. I think we also need to maintain
> oldestXidHavingUndo for CLOG truncation and transaction-wraparound. We
> can't allow CLOG truncation for the transaction whose undo is not
> discarded as that could be required by some other transaction. For
> similar reasons, we can't allow transaction-wraparound and we need to
> integrate this into the existing xid-allocation mechanism. I have
> found one of the old patch
> (Allow-execution-and-discard-of-undo-by-background-wo) attached
>

oops, forgot to attach the patch, doing now.

--
With Regards,
Amit Kapila.

Attachment	Content-Type	Size
Allow-execution-and-discard-of-undo-by-background-wo.patch	application/octet-stream	90.8 KB

In response to

Re: POC: Cleaning up orphaned files using undo logs at 2020-11-15 05:54:06 from Amit Kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Dilip Kumar	2020-11-15 06:04:09	Re: logical streaming of xacts via test_decoding is broken
Previous Message	Alexander Korotkov	2020-11-15 05:55:13	Re: Supporting = operator in gin/gist_trgm_ops