Re: POC: Cleaning up orphaned files using undo logs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-08-20 14:27:44
Message-ID: CA+TgmoaH48Ty2c08XcX_tR9kqrMS2CKCzh6h_9eQ1mqx3-tEYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 19, 2019 at 2:04 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> Currently, In UnpackedUndoRecord we store all members directly which
> are set by the caller. We store pointers to some header which are
> allocated internally by the undo layer and the caller need not worry
> about setting them. So now you are suggesting to put other headers
> also as structures in UnpackedUndoRecord. I as such don't have much
> problem in doing that but I think initially Robert designed
> UnpackedUndoRecord structure this way so it will be good if Robert
> provides his opinion on this.

I don't believe that's what is being suggested. It seems to me that
the thing Andres is complaining about here has roots in the original
sketch that I did for this code. The oldest version I can find is
here:

https://github.com/EnterpriseDB/zheap/commit/7d194824a18f0c5e85c92451beab4bc6f044254c

In this version, and I think still in the current version, there is a
two-stage marshaling strategy. First, the individual fields from the
UnpackedUndoRecord get copied into global variables (yes, that was my
fault, too, at least in part!) which are structures. Then, the
structures get copied into the target buffer. The idea of that design
was to keep the code simple, but it didn't really work out, because
things got a lot more complicated between the time I wrote those 3244
lines of code and the >3000 lines of code that live in this patch
today. One thing that change was that we moved more and more in the
direction of considering individual fields as separate objects to be
separately included or excluded, whereas when I wrote that code I
thought we were going to have groups of related fields that stood or
fell together. That idea turned out to be wrong. (There is the
even-larger question here of whether we ought to take Heikki's
suggestion and make this whole thing a lot more generic, but let's
start by discussing how the design that we have today could be
better-implemented.)

If I understand Andres correctly, he's arguing that we ought to get
rid of the two-stage marshaling strategy. During decoding, he wants
data to go directly from the buffer that contains it to the
UnpackedUndoRecord without ever being stored in the UnpackUndoContext.
During insertion, he wants data to go directly from the
UnpackedUndoRecord to the buffer that contains it. Or, at least, if
there has to be an intermediate format, he wants it to be just a chunk
of raw bytes, rather than a bunch of individual fields like we have in
UndoPackContext currently. I think that's a reasonable goal. I'm not
as concerned about it as he is from a performance point of view, but I
think it would make the code look nicer, and that would be good. If
we save CPU cycles along the way, that is also good.

In broad outline, what this means is:

1. Any field in the UndoPackContext that starts with urec_ goes away.
2. Instead of something like InsertUndoBytes((char *)
&(ucontext->urec_fxid), ...) we'd write InsertUndobytes((char *)
&uur->uur_fxid, ...).
3. Similarly instead of ReadUndoBytes((char *) &ucontext->urec_fxid,
...) we'd write ReadUndoBytes((char *) &uur->uur_fxid, ...).
4. It seems slightly trickier to handle the cases where we've got a
structure instead of individual fields, like urec_hd. But those could
be broken down into field-by-field reads and writes, e.g. in this case
one call for urec_type and a second for urec_info.
5. For uur_group and uur_logswitch, the code would need to allocate
those subsidiary structures before copying into them.

To me, that seems like it ought to be a pretty straightforward change
that just makes things simpler. We'd probably need to pass the
UnpackedUndoRecord to BeginUnpackUndo instead of FinishUnpackUndo, and
keep a pointer to it in the UnpackUndoContext, but that seems fine.
FinishUnpackUndo would end up just about empty, maybe entirely empty.

Is that a reasonable idea?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-08-20 14:33:12 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Peter Eisentraut 2019-08-20 14:21:21 ICU for global collation