Re: POC: Cleaning up orphaned files using undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-20 08:25:45
Message-ID: CAFiTN-smtk_LaO36f9=TBKp-wbbHXxTKv817B1RDsjVDPX3G6A@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 20, 2019 at 12:40 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jul 19, 2019 at 6:35 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Fri, Jul 19, 2019 at 12:10 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > We are doing exactly what you have written in the last line of the
> > > next paragraph: "stop the transaction from writing undo when the hash
> > > table is already too full". So we will never face the problems
> > > related to repeated crash recovery. The definition of "too full" is
> > > that we stop allowing new transactions that can write undo once the
> > > hash table already has entries equivalent to
> > > (UndoRollbackHashTableSize() - MaxBackends). Does this make sense?
> >
> > Oops, I was looking in the wrong place. Yes, that makes sense, but:
> >
> > 1. It looks like the test you added to PrepareUndoInsert has no
> > locking, and I don't see how that can be right.
> >
> > 2. It seems like this would result in testing for each new undo
> > insertion that gets prepared, whereas surely we would want to only
> > test when first attaching to an undo log. If you've already attached
> > to the undo log, there's no reason not to continue inserting into it,
> > because doing so doesn't increase the number of transactions (or
> > transaction-persistence level combinations) that need undo.
> >
>
> I agree that the check should not be done for each undo insertion, but
> rather whenever a transaction attaches to an undo log.
>
> > 3. I don't think the test itself is correct. It can fire even when
> > there's no problem. It is correct (or would be if it said 2 *
> > MaxBackends) if every other backend in the system is already attached
> > to an undo log (or two). But if they are not, it will block
> > transactions from being started for no reason.
> >
>
> Right, we should find a way to know the exact number of transactions
> attached to an undo log at any point in time; then we can have a more
> precise check.

Maybe we can make ProcGlobal->xactsHavingPendingUndo an atomic
variable. We would increment its value atomically whenever:
a) a transaction writes its first undo record for each persistence level, or
b) an abort request is inserted by the 'StartupPass'.

And we would decrement it when:
a) the transaction commits (by 1 for each persistence level it has
written undo for), or
b) a rollback request is processed.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
