Re: POC: Cleaning up orphaned files using undo logs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-10 19:08:33
Message-ID: CA+TgmoYaB+41gRp_bfbjyJ84JVkw9Lw7ZjJJqiEB8RK=VXZ=Gg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 9, 2019 at 6:28 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> PFA, updated patch version which includes
> - One defect fix in undo interface related to undo page compression
> for handling persistence level
> - Implemented pending TODO optimization in undo page compression.
> - One defect fix in undo processing related to the prepared transaction

Looking at 0002 a bit, it seems to me that you really need to spend
some energy getting things into a consistent order all across the
patch. For example, UndoPackStage uses the ordering: HEADER,
TRANSACTION, RMID, RELOID, XID, CID... But the declarations of the
UREC_INFO constants go in a different order: TRANSACTION, FORK, BLOCK,
BLKPREV... The comments defining those go in a different order and
some of them are missing. The definition of the UndoRecordBlah
structures go in a different order still: Transaction, Block,
LogSwitch, Payload. UndoRecordHeaderSize goes with FORK, BLOCK,
BLPREV, TRANSACTION, LOGSWITCH, .... That really needs to be
straightened out and made consistent.

You (still) need to rename blkprev to something more generic, as
mentioned in previous rounds of review.

I think it would be a good idea to avoid complex macros in favor of
functions where possible, e.g. UNDO_PAGE_PARTIAL_REC_SIZE. If
performance is a concern, it could be declared static inline, which
should be as good as a macro.

I don't like the fact that undoaccess.c has a new global,
undo_compression_info. I haven't read the code thoroughly, but do we
really need that? I think it's never modified (so it could just be
declared const), and I also think it's just all zeroes (so
initializing it isn't really necessary), and I also think that it's
just used for initializing other UndoCompressionInfos (so we could
just initialize them directly, either by setting the members
individually or jus zeroing them).

It seems like UndoRecordPrepareTransInfo ought to have an Assert(index
< some_limit) in the loop.

A comment in PrepareUndoInsert refers to "low switch" where it means
"log switch."

This is by no means a complete review, for which I unfortunately lack
the time at present. Just some initial observations.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2019-07-10 19:12:46 Re: pg_receivewal documentation
Previous Message Stephen Frost 2019-07-10 18:57:54 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)