Re: POC: Cleaning up orphaned files using undo logs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-05-01 00:32:28
Message-ID: CA+TgmoaKX0f6AEeBNF3iLzJ9A1UdaWPR_eSEh2b2o_yBTkfcEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Replying to myself to resend to the list, since my previous attempt
seems to have been eaten by a grue.

On Tue, Apr 30, 2019 at 11:14 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Apr 30, 2019 at 2:16 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > Like previous version these patch set also applies on:
> > https://github.com/EnterpriseDB/zheap/tree/undo
> > (b397d96176879ed5b09cf7322b8d6f2edd8043a5)
>
> Some more review of 0003:
>
> There is a whitespace-only hunk in xact.c.
>
> It would be nice if AtAbort_ResetUndoBuffers didn't need to exist at
> all. Then, this patch would make no changes whatsoever to xact.c.
> We'd still need such changes in other patches, because the whole idea
> of undo is tightly bound up with the concept of transactions. Yet,
> this particular patch wouldn't touch that file, and that would be
> nice. In general, the reason why we need AtCommit/AtAbort/AtEOXact
> callbacks is to adjust the values of global variables (or the data
> structures to which they point) at commit or abort time. And that is
> also the case here. The thing is, I'm not sure why we need these
> particular global variables. Is there some way that we can get by
> without them? The obvious thing to do seems to be to invent yet
> another context object, allocated via a new function, which can then
> be passed to PrepareUndoInsert, InsertPreparedUndo,
> UndoLogBuffersSetLSN, UnlockReleaseUndoBuffers, etc. This would
> obsolete UndoSetPrepareSize, since you'd instead pass the size to the
> context allocator function.
>
> UndoRecordUpdateTransInfo should declare a local variable
> XactUndoRecordInfo *something = &xact_urec_info[idx] and use that
> variable wherever possible.
>
> It should also probably use while (1) { ... } rather than do { ... }
> while (true).
>
> In UndoBufferGetSlot you could replace 'break;' with 'return i;' and
> then more than half the function would need one less level of
> indentation. This function should also declare PreparedUndoBuffer
> *something and set that variable equal to &prepared_undo_buffers[i] at
> the top of the loop and again after the loop, and that variable should
> then be used whenever possible.
>
> UndoRecordRelationDetails seems to need renaming now that it's down to
> a single member.
>
> The comment for UndoRecordBlock needs updating, as it still refers to blkprev.
>
> The comment for UndoRecordBlockPrev refers to "Identifying
> information" but it's not identifying anything. I think you could
> just delete "Identifying information for" from this sentence and not
> lose anything. And I think several of the nearby comments that refer
> to "Identifying information" could more simply and more correctly just
> refer to "Information".
>
> I don't think that SizeOfUrecNext needs to exist.
>
> xact.h should not add an include for undolog.h. There are no other
> changes to this header, so presumably the header does not depend on
> anything in undolog.h. If .c files that include xact.h need
> undolog.h, then the header should be included in those files, not in
> the header itself. That way, we avoid making partial recompiles more
> painful than necessary.
>
> UndoGetPrevUndoRecptr looks like a strange interface. Why not just
> arrange not to call the function in the first place if prevurp is
> valid?
>
> Every use of palloc0 in this code should be audited to check whether
> it is really necessary to zero the memory before use. If it will be
> initialized before it's used for anything anyway, it doesn't need to
> be pre-zeroed.
>
> FinishUnpackUndo looks nice! But it is missing a blank line in one
> place, and 'if it presents' should be 'if it is present' in a whole
> bunch of places.
>
> BeginInsertUndo also looks to be missing a few blank lines.
>
> Still need to do some cleanup of the UnpackedUndoRecord, e.g. unifying
> uur_xid and uur_xidepoch into uur_fxid.
>
> InsertUndoData ends in an unnecessary return, as does SkipInsertingUndoData.
>
> I think the argument to SkipInsertingUndoData should be the number of
> bytes to skip, rather than the starting byte within the block.
>
> Function header comment formatting is not consistent. Compare
> FinishUnpackUndo (function name recapitulated on first line of
> comment) to ReadUndoBytes (not recapitulated) to UnpackUndoData
> (entire header comment jammed into one paragraph with function name at
> start). I prefer the style where the function name is not restated,
> but YMMV. Anyway, it has to be consistent.
>
> UndoGetPrevRecordLen should not declare char *page instead of Page
> page, I think.
>
> UndoRecInfo looks a bit silly, I think. Isn't index just the index of
> this entry in the array? You can always figure that out by ptr -
> array_base. Instead of having UndoRecPtr urp in this structure, how
> about adding that to UnpackedUndoRecord? When inserting, caller
> leaves it unset and InsertPreparedUndo sets it; when retrieving,
> UndoFetchRecord or UndoRecordBulkFetch sets it. With those two
> changes, I think you can get rid of UndoRecInfo entirely and just
> return an array of UnpackedUndoRecords. This might also eliminate the
> need for an 'urp' member in PreparedUndoSpace.
>
> I'd probably use UREC_INFO_BLKPREV rather than UREC_INFO_BLOCKPREV for
> consistency.
>
> Similarly UndoFetchRecord and UndoRecordBulkFetch seems a bit
> inconsistent. Perhaps UndoBulkFetchRecord.
>
> In general, I find the code for updating transaction headers to be
> really hard to understand. I'm not sure exactly what can be done
> about that. Like, why is UndoRecordPrepareTransInfo unpacking undo?
> Why does it take two undo record pointers as arguments and how are
> they different?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-05-01 00:35:04 Re: Initializing LWLock Array from Server Code
Previous Message Robert Haas 2019-05-01 00:31:40 Re: message style