Re: POC: Cleaning up orphaned files using undo logs

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-19 11:54:08
Message-ID: CAJ3gD9cer=Frzff6B2nE1b2w9Orif+o90vKdBRsZUvz9f1YzwA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 9 May 2019 at 12:04, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

> Patches can be applied on top of undo branch [1] commit:
> (cb777466d008e656f03771cf16ec7ef9d6f2778b)
>
> [1] https://github.com/EnterpriseDB/zheap/tree/undo

Below are some review points for 0009-undo-page-consistency-checker.patch :

+ /* Calculate the size of the partial record. */
+ partial_rec_size = UndoRecordHeaderSize(phdr->uur_info) +
+ phdr->tuple_len + phdr->payload_len -
+ phdr->record_offset;

There is already an UndoPagePartialRecSize() function which calculates
the size of partial record, which seems to do the same as above. If
this is same, you can omit the above code, and instead down below
where you increment next_record, you can do "next_record +=
UndoPagePartialRecSize()".

Also, I see an extra sizeof(uint16) added in
UndoPagePartialRecSize(). Not sure which one is correct, and which one
wrong, unless I am wrong in assuming that the above calculation and
the function definition do the same thing.

------------------

+ * We just want to mask the cid in the undo record header. So
+ * only if the partial record in the current page include the undo
+ * record header then we need to mask the cid bytes in this page.
+ * Otherwise, directly jump to the next record.
Here, I think you mean : "So only if the partial record in the current
page includes the *cid* bytes", rather than "includes the undo record
header"
May be we can say :
We just want to mask the cid. So do the partial record masking only if
the current page includes the cid bytes from the partial record
header.

----------------

+ if (phdr->record_offset < (cid_offset + sizeof(CommandId)))
+ {
+ char *cid_data;
+ Size mask_size;
+
+ mask_size = Min(cid_offset - phdr->record_offset,
+ sizeof(CommandId));
+
+ cid_data = next_record + cid_offset - phdr->record_offset;
+ memset(&cid_data, MASK_MARKER, mask_size);
+
Here, if record_offset lies *between* cid start and cid end, then
cid_offset - phdr->record_offset will be negative, and so will be
mask_size. Probably abs() should do the work.

Also, an Assert(cid_data + mask_size <= page_end) would be nice. I
know cid position of a partial record cannot go beyond the page
boundary, but it's better to have this Assert to do sanity check.

+ * Process the undo record of the page and mask their cid filed.
filed => field

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-07-19 11:59:41 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Previous Message Antonin Houska 2019-07-19 11:32:01 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)