Re: POC: Cleaning up orphaned files using undo logs

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2020-11-26 09:28:21
Message-ID: 16128.1606382901@antos
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Nov 25, 2020 at 7:47 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > > I think we also need to maintain oldestXidHavingUndo for CLOG truncation and
> > > > transaction-wraparound. We can't allow CLOG truncation for the transaction
> > > > whose undo is not discarded as that could be required by some other
> > > > transaction.
> > >
> > > Good point. Even the discard worker might need to check the transaction status
> > > when deciding whether undo log of that transaction should be discarded.
> >
> > In the zheap code [1] I see that DiscardWorkerMain() discards undo log up to
> > OldestXmin:
> >
> >
> >     OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_AUTOVACUUM |
> >                                PROCARRAY_FLAGS_VACUUM);
> >
> >     oldestXidHavingUndo = GetXidFromEpochXid(pg_atomic_read_u64(&ProcGlobal->oldestXidWithEpochHavingUndo));
> >
> >     /*
> >      * Call the discard routine if there oldestXidHavingUndo is lagging
> >      * behind OldestXmin.
> >      */
> >     if (OldestXmin != InvalidTransactionId &&
> >         TransactionIdPrecedes(oldestXidHavingUndo, OldestXmin))
> >     {
> >         UndoDiscard(OldestXmin, &hibernate);
> >
> > and that UndoDiscard() eventually advances oldestXidHavingUndo in the shared
> > memory.
> >
> > I'm not sure this is correct because, IMO, OldestXmin can advance as soon as
> > AbortTransaction() has cleared both xid and xmin fields of the transaction's
> > PGXACT (by calling ProcArrayEndTransactionInternal). However the corresponding
> > undo log may still be waiting for processing. Am I wrong?

> The UndoDiscard->UndoDiscardOneLog ensures that we don't discard the
> undo if there is a pending abort.

OK, I should have dug deeper than just reading the header comment of
UndoDiscard(). I have checked it now and I think I understand why no
information is lost.
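
To spell out how I read that check, here is a minimal sketch. It is a toy
model, not the zheap code: ToyUndoXact, toy_xid_precedes() and
toy_discard_one_log() are hypothetical names, and the assumption that the
scan stops at the first transaction that cannot be discarded is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;

/* Hypothetical summary of one transaction's undo in a log, oldest first. */
typedef struct ToyUndoXact
{
    Xid     xid;
    bool    aborted;        /* the transaction aborted */
    bool    undo_applied;   /* only meaningful if aborted */
} ToyUndoXact;

/* Wraparound-aware comparison, the same idea as TransactionIdPrecedes()
 * uses for normal XIDs. */
static bool
toy_xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Return how many leading transactions of the log can be discarded.
 * The scan stops at the first xid that is not older than oldest_xmin,
 * or at the first aborted transaction whose undo is still pending.
 */
static size_t
toy_discard_one_log(const ToyUndoXact *xacts, size_t n, Xid oldest_xmin)
{
    size_t  i;

    assert(xacts != NULL || n == 0);

    for (i = 0; i < n; i++)
    {
        if (!toy_xid_precedes(xacts[i].xid, oldest_xmin))
            break;          /* some snapshot may still need this undo */
        if (xacts[i].aborted && !xacts[i].undo_applied)
            break;          /* pending abort: undo must be kept */
    }
    return i;
}
```

So even if OldestXmin has advanced past an aborted transaction, its undo
stays in place until the undo has been applied.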

Nevertheless, I see in the zheap code that the discard worker may need to scan
a lot of undo log each time. While the oldest_xid and oldest_data fields of
UndoLogControl help to skip parts of the log, I'm not sure that such
information fits into the undo-record-set (URS) approach. For now I'm inclined
to implement the "exhaustive" scan for the URS too, and to teach the discard
worker later to store some metadata so that the processing becomes
incremental.
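
A rough sketch of the difference, assuming a cached per-log hint similar in
spirit to oldest_xid. SketchUndoLog and sketch_discard() are hypothetical and
say nothing about the real UndoLogControl or URS layout; the return value only
counts how many entries had to be read from the log itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;

/* Hypothetical in-memory view of one undo log. */
typedef struct SketchUndoLog
{
    const Xid  *record_xids;    /* xid of each entry, oldest first */
    size_t      nrecords;
    size_t      discarded;      /* leading entries already discarded */
    bool        hint_valid;     /* is oldest_xid usable? */
    Xid         oldest_xid;     /* cached xid of first undiscarded entry */
} SketchUndoLog;

static bool
sketch_xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Discard entries older than oldest_xmin and return the number of entries
 * that had to be examined. With use_hint, a log whose cached oldest_xid is
 * not older than oldest_xmin is skipped without reading it at all.
 */
static size_t
sketch_discard(SketchUndoLog *log, Xid oldest_xmin, bool use_hint)
{
    size_t  examined = 0;

    assert(log != NULL);

    if (use_hint && log->hint_valid &&
        !sketch_xid_precedes(log->oldest_xid, oldest_xmin))
        return 0;

    while (log->discarded < log->nrecords)
    {
        examined++;
        if (!sketch_xid_precedes(log->record_xids[log->discarded],
                                 oldest_xmin))
            break;
        log->discarded++;
    }

    /* Remember where the next pass would have to start. */
    log->hint_valid = log->discarded < log->nrecords;
    if (log->hint_valid)
        log->oldest_xid = log->record_xids[log->discarded];

    return examined;
}
```

Without the hint, every pass has to read at least the first undiscarded
entry of every log; with it, logs that have nothing to discard cost nothing.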

> > I think that oldestXidHavingUndo should be advanced at the time transaction
> > commits or when the undo log of an aborted transaction has been applied.
> >
>
> We can't advance oldestXidHavingUndo just on commit because later we
> need to rely on it for visibility, basically any transaction older
> than oldestXidHavingUndo should be all-visible.

ok

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
