Re: POC: Cleaning up orphaned files using undo logs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-04 11:53:52
Message-ID: CAA4eK1LHq2T3T4pQOnaO0A=pOeCGc5gpCTuPYp8_Sc7nLV7tig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 1, 2019 at 1:24 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> Another small change/review: the function UndoLogGetNextInsertPtr()
> previously took a transaction ID, but I'm not sure if that made sense,
> I need to think about it some more.
>

The changes you have made related to UndoLogGetNextInsertPtr() doesn't
seem correct to me.

@@ -854,7 +854,9 @@ FindUndoEndLocationAndSize(UndoRecPtr start_urecptr,
* has already started in this log then lets re-fetch the undo
* record.
*/
- next_insert = UndoLogGetNextInsertPtr(slot->logno, uur->uur_xid);
+ next_insert = UndoLogGetNextInsertPtr(slot->logno);
+
+ /* TODO this can't happen */
if (!UndoRecPtrIsValid(next_insert))

I think this is a possible case. Say while the discard worker tries
to register the rollback request from some log and after it fetches
the undo record corresponding to start location in this function,
another backend adds the new transaction undo. The same is mentioned
in comments as well. Can you explain what makes you think that this
can't happen? If we don't want to pass the xid to
UndoLogGetNextInsertPtr, then I think we need to get the insert
location before fetching the record. I will think more on it to see
if there is any other problem with the same.

2.
@@ -167,25 +205,14 @@ UndoDiscardOneLog(UndoLogSlot *slot,
TransactionId xmin, bool *hibernate)
+ if (!TransactionIdIsValid(wait_xid) && !pending_abort)
{
UndoRecPtr next_insert = InvalidUndoRecPtr;
- /*
- * If more undo has been inserted since we checked last, then
- * we can process that as well.
- */
- next_insert = UndoLogGetNextInsertPtr(logno, undoxid);
- if (!UndoRecPtrIsValid(next_insert))
- continue;
+ next_insert = UndoLogGetNextInsertPtr(logno);

This change is also not safe. It can lead to discarding the undo of
some random transaction because a new undo records from some other
transaction would have been added since we last fetched the undo
record. This can be fixed by just removing the call to
UndoLogGetNextInsertPtr. I have done so in undoprocessing branch and
added the comment as well.

I think the common problem with the above changes is that it assumes
that new undo can't be added to undo logs while discard worker is
processing them.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Khandekar 2019-07-04 11:54:51 Re: Minimal logical decoding on standbys
Previous Message Amit Khandekar 2019-07-04 11:51:47 Re: Minimal logical decoding on standbys