Re: HOT chain validation in verify_heapam()

From: Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-09-30 13:54:36
Message-ID: CAPF61jDNh3-pGpKuAJ5FN73A8oDoN8iOi3vGU_DY4RP9txdcqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 20, 2022 at 6:43 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Sep 20, 2022 at 5:00 AM Himanshu Upadhyaya
> <upadhyaya(dot)himanshu(at)gmail(dot)com> wrote:
> > Please find it attached.
>
> This patch still has no test cases. Just as we have test cases for the
> existing corruption checks, we should have test cases for these new
> corruption checks, showing cases where they actually fire.
>
> Test cases are now part of this v6 patch.

> I think I would be inclined to set lp_valid[x] = true in both the
> redirected and non-redirected case, and then have the very first thing
> that the second loop does be if (nextoffnum == 0 ||
> !lp_valid[ctx.offnum]) continue. I think that would be more clear
> about the intent to ignore line pointers that failed validation. Also,
> if you did it that way, then you could catch the case of a redirected
> line pointer pointing to another redirected line pointer, which is a
> corruption condition that the current code does not appear to check.
>
> Yes, it's a good idea to do this additional validation with a redirected
line pointer. Done.

> + /*
> + * Validation via the predecessor array. 1) If the
> predecessor's
> + * xmin is aborted or in progress, the current tuples xmin
> should
> + * be aborted or in progress respectively. Also both xmin's
> must
> + * be equal. 2) If the predecessor's xmin is not frozen, then
> + * current tuple's shouldn't be either. 3) If the
> predecessor's
> + * xmin is equal to the current tuple's xmin, the current
> tuple's
> + * cmin should be greater than the predecessor's cmin. 4) If
> the
> + * current tuple is not HOT then its predecessor's tuple must
> not
> + * be HEAP_HOT_UPDATED. 5) If the current tuple is HOT then
> its
> + * predecessor's tuple must be HEAP_HOT_UPDATED.
> + */
>
> This comment needs to be split up into pieces and the pieces need to
> be moved closer to the tests to which they correspond.
>
> Done.

> + psprintf("unfrozen tuple was
> updated to produce a tuple at offset %u which is not frozen",
>
Shouldn't this say "which is frozen"?
>
> Done.

> + * Not a corruption if current tuple is updated/deleted by a
> + * different transaction, means t_cid will point to cmax
> (that is
> + * command id of deleting transaction) and cid of predecessor
> not
> + * necessarily will be smaller than cid of current tuple.
> t_cid
>
> I think that the next person who reads this code is likely to
> understand that the CIDs of different transactions are numerically
> unrelated. What's less obvious is that if the XID is the same, the
> newer update must have a higher CID.
>
> + * can hold combo command id but we are not worrying here
> since
> + * combo command id of the next updated tuple (if present)
> must be
> + * greater than combo command id of the current tuple. So
> here we
> + * are not checking HEAP_COMBOCID flag and simply doing t_cid
> + * comparison.
>
> I disapprove of ignoring the HEAP_COMBOCID flag. Emitting a message
> claiming that the CID has a certain value when that's actually a combo
> CID is misleading, so at least a different message wording is needed
> in such cases. But it's also not clear to me that the newer update has
> to have a higher combo CID, because combo CIDs can be reused. If you
> have multiple cursors open in the same transaction, the updates can be
> interleaved, and it seems to me that it might be possible for an older
> CID to have created a certain combo CID after a newer CID, and then
> both cursors could update the same page in succession and end up with
> combo CIDs out of numerical order. Unless somebody can make a
> convincing argument that this isn't possible, I think we should just
> skip this check for cases where either tuple has a combo CID.
>
> + if (TransactionIdEquals(pred_xmin, curr_xmin) &&
> + (TransactionIdEquals(curr_xmin, curr_xmax) ||
> + !TransactionIdIsValid(curr_xmax)) && pred_cmin >=
> curr_cmin)
>
> I don't understand the reason for the middle part of this condition --
> TransactionIdEquals(curr_xmin, curr_xmax) ||
> !TransactionIdIsValid(curr_xmax). I suppose the comment is meant to
> explain this, but I still don't get it. If a tuple with XMIN 12345
> CMIN 2 is updated to produce a tuple with XMIN 12345 CMIN 1, that's
> corruption, regardless of what the XMAX of the second tuple may happen
> to be.
>
> As discussed in our last discussion, I am removing this check altogether.

> + if (HeapTupleHeaderIsHeapOnly(curr_htup) &&
> + !HeapTupleHeaderIsHotUpdated(pred_htup))
>
> + if (!HeapTupleHeaderIsHeapOnly(curr_htup) &&
> + HeapTupleHeaderIsHotUpdated(pred_htup))
>
> I think it would be slightly clearer to write these tests the other
> way around i.e. check the previous tuple's state first.
>
> Done.

> + if (!TransactionIdIsValid(curr_xmax) &&
> HeapTupleHeaderIsHotUpdated(tuphdr))
> + {
> + report_corruption(ctx,
> + psprintf("tuple has been updated, but xmax is
> 0"));
> + result = false;
> + }
>
> I guess this message needs to say "tuple has been HOT updated, but
> xmax is 0" or something like that.
>
> Done.

--
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v6-0001-Implement-HOT-chain-validation-in-verify_heapam.patch text/x-patch 18.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Japin Li 2022-09-30 13:59:55 Re: log_heap_visible(): remove unused parameter and update comment
Previous Message Drouvot, Bertrand 2022-09-30 12:24:47 Re: log_heap_visible(): remove unused parameter and update comment