Re: POC: Cleaning up orphaned files using undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-16 08:50:11
Message-ID: CAFiTN-v6Xj_PF+5k+XZOknvw4yxuJ8OkAS2wGcBtEt5rKkd0wg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 11, 2019 at 9:17 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Jul 11, 2019 at 12:38 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Tue, Jul 9, 2019 at 6:28 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > PFA, updated patch version which includes
> > > - One defect fix in undo interface related to undo page compression
> > > for handling persistence level
> > > - Implemented pending TODO optimization in undo page compression.
> > > - One defect fix in undo processing related to the prepared transaction
> >
> > Looking at 0002 a bit, it seems to me that you really need to spend
> > some energy getting things into a consistent order all across the
> > patch. For example, UndoPackStage uses the ordering: HEADER,
> > TRANSACTION, RMID, RELOID, XID, CID... But the declarations of the
> > UREC_INFO constants go in a different order: TRANSACTION, FORK, BLOCK,
> > BLKPREV... The comments defining those go in a different order and
> > some of them are missing. The definition of the UndoRecordBlah
> > structures go in a different order still: Transaction, Block,
> > LogSwitch, Payload. UndoRecordHeaderSize goes with FORK, BLOCK,
> > BLPREV, TRANSACTION, LOGSWITCH, .... That really needs to be
> > straightened out and made consistent.

I have worked on this part, please check in the latest patch. For
some of the header i.e RMID, RELOID, XID, CID, FORK, PREVUNDO, and
BLOCK have an only one member so there are no structures for them
except this I think others are in order now.

>
> > You (still) need to rename blkprev to something more generic, as
> > mentioned in previous rounds of review.
>
> I will change this.
Changed to prevundo

> >
> > I think it would be a good idea to avoid complex macros in favor of
> > functions where possible, e.g. UNDO_PAGE_PARTIAL_REC_SIZE. If
> > performance is a concern, it could be declared static inline, which
> > should be as good as a macro.
> ok
Done
> >
> > I don't like the fact that undoaccess.c has a new global,
> > undo_compression_info. I haven't read the code thoroughly, but do we
> > really need that? I think it's never modified (so it could just be
> > declared const),
>
> Actually, this will get modified otherwise across undo record
> insertion how we will know what was the values of the common fields in
> the first record of the page. Another option could be that every time
> we insert the record, read the value from the first complete undo
> record on the page but that will be costly because for every new
> insertion we need to read the first undo record of the page.
>
> Currently, we are doing like this
>
> a) BeginUndoRecordInsert
> - Copy the global "undo_compression_info" to our local context for
> handling multi-prepare because for multi-prepare we don't want to
> update the global value until we have successfully inserted the undo
> record.
>
> b) PrepareUndoInsert
> -Operate on the context and update the context->undo_compression_info
> if required (page changed)
>
> c)InsertPrepareUndo
> - After we have inserted successfully overwrite
> context->undo_compression_info to the global "undo_compression_info".
> So that next undo insertion can get the right information.
>
> and I also think it's just all zeroes (so
> > initializing it isn't really necessary), and I also think that it's
> > just used for initializing other UndoCompressionInfos (so we could
> > just initialize them directly, either by setting the members
> > individually or jus zeroing them).
>
> Initially, I was doing that but later I thought that InvalidUndoRecPtr
> is macro (although the value is 0) shouldn't we initialize all
> UndoRecPtr variables with value InvalidUndoRecPtr instead of directly
> using 0 so I changed like this.
>
> >
> > It seems like UndoRecordPrepareTransInfo ought to have an Assert(index
> > < some_limit) in the loop.
Done
> >
> > A comment in PrepareUndoInsert refers to "low switch" where it means
> > "log switch."
>
> I will fix.
Fixed
> >
> > This is by no means a complete review, for which I unfortunately lack
> > the time at present. Just some initial observations.
> >
> ok

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
undolog_20190716.tar.gz application/x-gzip 159.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2019-07-16 08:52:34 Re: Check-out mutable functions in check constraints
Previous Message Tomas Vondra 2019-07-16 08:22:04 Re: SegFault on 9.6.14