Re: POC: Cleaning up orphaned files using undo logs

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2020-12-04 09:22:42
Message-ID: 49645.1607073762@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > >
> > > If you want to track at undo record level, then won't it lead to
> > > performance overhead and probably additional WAL overhead considering
> > > this action needs to be WAL-logged. I think recording at page-level
> > > might be a better idea.
> >
> > I'm not worried about WAL because the undo execution needs to be WAL-logged
> > anyway - see smgr_undo() in the 0005- part of the patch set. What needs to be
> > evaluated regarding performance is the (exclusive) locking of the page that
> > carries the progress information.
> >
> That is just for one kind of smgr, think how you will do it for
> something like zheap. Their idea is to collect all the undo records
> (unless the undo for a transaction is very large) for one zheap-page
> and apply them together, so maintaining the status at each undo record
> level will surely lead to a large amount of additional WAL. See below
> how and why we have decided to do it differently.
> > I'm still not sure whether this info should
> > be on every page or only in the chunk header. In either case, we have a
> > problem if there are two or more chunks created by different transactions on
> > the same page, and if more than on of these transactions need to perform
> > undo. I tend to believe that this should happen rarely though.
> >
> I think we need to maintain this information at the transaction level
> and need to update it after processing a few blocks, at least that is
> what was decided and implemented earlier. We also need to update it
> when the log is switched or all the actions of the transaction were
> applied. The reasoning is that for short transactions it won't matter
> and for larger transactions, it is good to update it after a few pages
> to avoid WAL and locking overhead. Also, it is better if we collect
> the undo in bulk, this is proved to be beneficial for large
> transactions.

Attached is what I originally did not include in the patch series, see the
part 0012. I have no better idea so far. The progress information is stored in
the chunk header.

To avoid too frequent locking, maybe the UpdateLastAppliedRecord() function
can be modified so it recognizes when it's necessary to update the progress
info. Also the user (zheap) should think when it should call the function.
Since I've included 0012 now as a prerequisite for discarding (0013),
currently it's only necessary to update the progress at undo log chunk

In this version of the patch series I wanted to publish the remaining ideas I
haven't published yet.

> The earlier version of the patch having all these ideas
> implemented is attached
> (Infrastructure-to-execute-pending-undo-actions and
> Provide-interfaces-to-store-and-fetch-undo-records). The second one
> has some APIs used by the first one but the main concepts were
> implemented in the first one
> (Infrastructure-to-execute-pending-undo-actions). I see that in the
> current version these can't be used as it is but still it can give us
> a good start point and we might be able to either re-use some code and
> or ideas from these patches.

Is there a branch with these patches applied? They reference some functions
that I don't see in [1]. I'd like to examine if / how my approach can be
aligned with the current zheap design.


Antonin Houska

Attachment Content-Type Size
undo-20201204.tgz application/x-gzip 176.7 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message hou zhijie 2020-12-04 09:29:01 Re: A new function to wait for the backend exit after termination
Previous Message Bharath Rupireddy 2020-12-04 09:13:28 Re: A new function to wait for the backend exit after termination