Re: HOT chain validation in verify_heapam()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-09-26 20:05:46
Message-ID: CA+TgmobeX1ED0domeb2bvaAefhmq95t-nZma7HJjTbqZi2R=4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 24, 2022 at 8:45 AM Himanshu Upadhyaya
<upadhyaya(dot)himanshu(at)gmail(dot)com> wrote:
> Here our objective is to validate if both Predecessor's xmin and current Tuple's xmin are same then cmin of predecessor must be less than current Tuple's cmin. In case when both tuple xmin's are same then I think predecessor's t_cid will always hold combo CID.
> Then either one or both tuple will always have a combo CID and skipping this check based on "either tuple has a combo CID" will make this if condition to be evaluated to false ''.

Fair point. I think maybe we should just remove the CID validation
altogether. I thought initially that it would be possible to have a
newer update with a numerically lower combo CID, but after some
experimentation I don't see a way to do it. However, it also doesn't
seem very useful to me to check that the combo CIDs are in ascending
order. I mean, even if that's not supposed to happen and does anyway,
there aren't really any enduring consequences, because command IDs are
ignored completely outside of the transaction that performed the
operation originally. So even if the combo CIDs were set to completely
random values, is that really corruption? At most it messes things up
for the duration of one transaction. And if we just have plain CIDs
rather than combo CIDs, the same thing is true: they could be totally
messed up and it wouldn't really matter beyond the lifetime of that
one transaction.

Also, it would be a lot more tempting to check this if we could check
it in all cases, but we can't. If a tuple is inserted in transaction
T1 and ten updated twice in transaction T2, we'll have only one combo
CID and nothing to compare it against, nor any way to decode what CMIN
and CMAX it originally represented. And this is probably a pretty
common type of case.

>> + if (TransactionIdEquals(pred_xmin, curr_xmin) &&
>> + (TransactionIdEquals(curr_xmin, curr_xmax) ||
>> + !TransactionIdIsValid(curr_xmax)) && pred_cmin >= curr_cmin)
>>
>> I don't understand the reason for the middle part of this condition --
>> TransactionIdEquals(curr_xmin, curr_xmax) ||
>> !TransactionIdIsValid(curr_xmax). I suppose the comment is meant to
>> explain this, but I still don't get it. If a tuple with XMIN 12345
>> CMIN 2 is updated to produce a tuple with XMIN 12345 CMIN 1, that's
>> corruption, regardless of what the XMAX of the second tuple may happen
>> to be.
>
> tuple | t_xmin | t_xmax | t_cid | t_ctid | tuple_data_split | heap_tuple_infomask_flags
>
> -------+--------+--------+-------+--------+---------------------------------------------+------------------------------------------------------------------------------------------------------------------
> -------------
> 1 | 971 | 971 | 0 | (0,3) | {"\\x1774657374312020202020","\\x01000000"} | ("{HEAP_HASVARWIDTH,HEAP_COMBOCID,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_HOT_UPDATED}",{})
> 2 | 971 | 0 | 1 | (0,2) | {"\\x1774657374322020202020","\\x02000000"} | ("{HEAP_HASVARWIDTH,HEAP_XMAX_INVALID}",{})
> 3 | 971 | 971 | 1 | (0,4) | {"\\x1774657374322020202020","\\x01000000"} | ("{HEAP_HASVARWIDTH,HEAP_COMBOCID,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY
> _TUPLE}",{})
> 4 | 971 | 972 | 0 | (0,5) | {"\\x1774657374332020202020","\\x01000000"} | ("{HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE}",{})
> 5 | 972 | 0 | 0 | (0,5) | {"\\x1774657374342020202020","\\x01000000"} | ("{HEAP_HASVARWIDTH,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE}",{})
>
> In the above case Tuple 1->3->4 is inserted and updated by xid 971 and tuple 4 is next update by xid 972, here t_cid of tuple 4 is 0 where as its predecessor's t_cid is 1, because in Tuple 4 t_cid is having command ID of deleting transaction(cmax), that is why we need to check xmax of the Tuple.
>
> Please correct me if I am missing anything here?

Hmm, I see, so basically you're trying to check whether the CID field
contains a CMIN as opposed to a CMAX. But I'm not sure this test is
entirely reliable, because heap_prepare/execute_freeze_tuple() can set
a tuple's xmax to InvalidTransactionId even after it's had some other
value, and that won't do anything to the contents of the CID field.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-09-26 20:11:13 Re: identifying the backend that owns a temporary schema
Previous Message Tom Lane 2022-09-26 19:50:09 Re: identifying the backend that owns a temporary schema