Re: Undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Dilip Kumar <dilip(dot)kumar(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Undo logs
Date: 2018-11-14 09:12:17
Message-ID: CAFiTN-v-uusoMw3b-qRLuegry8-ygPa5YkDeweL9kTZ57Fo68w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 10, 2018 at 9:27 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Nov 5, 2018 at 5:10 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> [review for undo record layer (0003-undo-interface-v3)]
>
> I might sound repeating myself, but just to be clear, I was involved
> in the design of this patch as well and I have given a few high-level
> inputs for this patch. I have used this interface in the zheap
> development, but haven't done any sort of detailed review which I am
> doing now. I encourage others also to review this patch.

Thanks for the review, please find my reply inline.
>
> 1.
> * NOTES:
> + * Handling multilog -
> + * It is possible that the undo record of a transaction can be spread across
> + * multiple undo log. And, we need some special handling while inserting the
> + * undo for discard and rollback to work sanely.
>
> I think before describing how the undo record is spread across
> multiple logs, you can explain how it is laid out when that is not the
> case. You can also explain how undo record headers are linked. I am
> not sure file header is the best place or it should be mentioned in
> README, but I think for now we can use file header for this purpose
> and later we can move it to README if required.
Added in the header.

>
> 2.
> +/*
> + * FIXME: Do we want to support undo tuple size which is more than the BLCKSZ
> + * if not than undo record can spread across 2 buffers at the max.
> + */
>
> +#define MAX_BUFFER_PER_UNDO 2
>
> I think here the right question is what is the possibility of undo
> record to be greater than BLCKSZ? For zheap, as of today, we don'
> have any such requirement as the largest undo record is written for
> update or multi_insert and in both cases we don't exceed the limit of
> BLCKSZ. I guess some user other than zheap could probably have such
> requirement and I don't think it is impossible to enhance this if we
> have any requirement.
>
> If anybody else has an opinion here, please feel to share it.

Should we remove this FIXME or lets wait for some other opinion. As
of now I have kept it as it is.
>
> 3.
> +/*
> + * FIXME: Do we want to support undo tuple size which is more than the BLCKSZ
> + * if not than undo record can spread across 2 buffers at the max.
> + */
> +#define MAX_BUFFER_PER_UNDO 2
> +
> +/*
> + * Consider buffers needed for updating previous transaction's
> + * starting undo record. Hence increased by 1.
> + */
> +#define MAX_UNDO_BUFFERS (MAX_PREPARED_UNDO + 1) * MAX_BUFFER_PER_UNDO
> +
> +/* Maximum number of undo record that can be prepared before calling insert. */
> +#define MAX_PREPARED_UNDO 2
>
> I think it is better to define MAX_PREPARED_UNDO before
> MAX_UNDO_BUFFERS as the first one is used in the definition of a
> second.

Done
>
> 4.
> +/*
> + * Call PrepareUndoInsert to tell the undo subsystem about the undo record you
> + * intended to insert. Upon return, the necessary undo buffers are pinned.
> + * This should be done before any critical section is established, since it
> + * can fail.
> + *
> + * If not in recovery, 'xid' should refer to the top transaction id because
> + * undo log only stores mapping for the top most transactions.
> + * If in recovery, 'xid' refers to the transaction id stored in WAL.
> + */
> +UndoRecPtr
> +PrepareUndoInsert(UnpackedUndoRecord *urec, UndoPersistence upersistence,
> + TransactionId xid)
>
> This function locks the buffer as well which is right as we need to do
> that before critical section, but the function header comments doesn't
> indicate it. You can modify it as:
> "Upon return, the necessary undo buffers are pinned and locked."
>
> Note that similar modification is required in .h file as well.

Done
>
> 5.
> +/*
> + * Insert a previously-prepared undo record. This will lock the buffers
> + * pinned in the previous step, write the actual undo record into them,
> + * and mark them dirty. For persistent undo, this step should be performed
> + * after entering a critical section; it should never fail.
> + */
> +void
> +InsertPreparedUndo(void)
>
> Here, the comments are wrong as buffers are already locked in the
> previous step. A similar change is required in .h file as well.

Fixed
>
> 6.
> +InsertPreparedUndo(void)
> {
> ..
> /*
> + * Try to insert the record into the current page. If it doesn't
> + * succeed then recall the routine with the next page.
> + */
> + if (InsertUndoRecord(uur, page, starting_byte, &already_written, false))
> + {
> + undo_len += already_written;
> + MarkBufferDirty(buffer);
> + break;
> + }
> +
> + MarkBufferDirty(buffer);
> ..
> }
>
> Here, you are writing into a shared buffer and marking it dirty, isn't
> it a good idea to Assert for being in the critical section?
Done
>
> 7.
> +/* Maximum number of undo record that can be prepared before calling insert. */
> +#define MAX_PREPARED_UNDO 2
>
> /record/records
>
> I think this definition doesn't define the maximum number of undo
> records that can be prepared as the caller can use UndoSetPrepareSize
> to change it. I think you can modify the comment as below or
> something on those lines:
> "This defines the number of undo records that can be prepared before
> calling insert by default. If you need to prepare more than
> MAX_PREPARED_UNDO undo records, then you must call UndoSetPrepareSize
> first."

Fixed
>
> 8.
> + * Caller should call this under log->discard_lock
> + */
> +static bool
> +IsPrevTxnUndoDiscarded(UndoLogControl *log, UndoRecPtr prev_xact_urp)
> +{
> + if (log->oldest_data == InvalidUndoRecPtr)
> ..
>
> Isn't it a good idea to have an Assert that we already have discard_lock?
Done

>
> 9.
> + UnpackedUndoRecord uur; /* prev txn's first undo record.*/
> +} PreviousTxnInfo;
>
> Extra space at the of comment is required.

Done
>
> 10.
> +/*
> + * Check if previous transactions undo is already discarded.
> + *
> + * Caller should call this under log->discard_lock
> + */
> +static bool
> +IsPrevTxnUndoDiscarded(UndoLogControl *log, UndoRecPtr prev_xact_urp)
> +{
>
> The name suggests that this function is doing something special for
> the previous transaction whereas this it just checks whether undo is
> discarded corresponding to a particular undo location. Isn't it
> better if we name it as UndoRecordExists or UndoRecordIsValid? Then
> explain in comments when do you consider particular record exists.
>
Changed to UndoRecordIsValid

> Another point to note is that you are not releasing the lock in all
> paths, so it is better to mention in comments when will it be released
> and when not.
Done

>
>
> 11.
> +IsPrevTxnUndoDiscarded(UndoLogControl *log, UndoRecPtr prev_xact_urp)
> +{
> + if (log->oldest_data == InvalidUndoRecPtr)
> + {
> + /*
> + * oldest_data is not yet initialized. We have to check
> + * UndoLogIsDiscarded and if it's already discarded then we have
> + * nothing to do.
> + */
> + LWLockRelease(&log->discard_lock);
> + if (UndoLogIsDiscarded(prev_xact_urp))
> + return true;
>
> The comment in above code is just trying to write the code in words.
> I think here you should tell why we need to call UndoLogIsDiscarded
> when oldest_data is not initialized and or the scenario when
> oldest_data will not be initialized.
Fixed
>
> 12.
> + * The actual update will be done by UndoRecordUpdateTransInfo under the
> + * critical section.
> + */
> +void
> +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched)
> +{
> ..
>
> I find this function name bit awkward. How about UndoRecordPrepareTransInfo?

Changed as per the suggestion
>
> 13.
> +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched)
> {
> ..
> + /*
> + * TODO: For now we don't know how to build a transaction chain for
> + * temporary undo logs. That's because this log might have been used by a
> + * different backend, and we can't access its buffers. What should happen
> + * is that the undo data should be automatically discarded when the other
> + * backend detaches, but that code doesn't exist yet and the undo worker
> + * can't do it either.
> + */
> + if (log->meta.persistence == UNDO_TEMP)
> + return;
>
> Aren't we already dealing with this case in the other patch [1]?
> Basically, I think we should discard it at commit time and or when the
> backend is detached.

Changed
>
> 14.
> +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched)
> {
> ..
> /*
> + * If previous transaction's urp is not valid means this backend is
> + * preparing its first undo so fetch the information from the undo log
> + * if it's still invalid urp means this is the first undo record for this
> + * log and we have nothing to update.
> + */
> + if (!UndoRecPtrIsValid(prev_xact_urp))
> + return;
> ..
>
> This comment is confusing. It appears to be saying same thing twice.
> You can write it along something like:
>
> "The absence of previous transaction's undo indicate that this backend
> is preparing its first undo in which case we have nothing to update.".

Done as per the suggestion
>
> 15.
> +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched)
> /*
> + * Acquire the discard lock before accessing the undo record so that
> + * discard worker doen't remove the record while we are in process of
> + * reading it.
> + */
>
> Typo doen't/doesn't.
>
> I think you can use 'can't' instead of doesn't.

Fixed
>
> This is by no means a complete review, rather just noticed a few
> things while reading the patch.
>
> [1] - https://www.postgresql.org/message-id/CAFiTN-t8fv-qYG9zynhS-1jRrvt_o5C-wCMRtzOsK8S%3DMXvKKw%40mail.gmail.com
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
0003-undo-interface-v4.patch application/octet-stream 68.5 KB
0004-undo-interface-test-v4.patch application/octet-stream 68.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2018-11-14 09:27:55 Re: Undo logs
Previous Message Thomas Munro 2018-11-14 08:15:21 Re: DSM segment handle generation in background workers