Re: HOT chain validation in verify_heapam()

From: Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-09-27 06:40:01
Message-ID: CAPF61jB8-LMWqD3eb34Y_SWpORfn59FT_gvDEdZ6Si2gPSDYDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 27, 2022 at 1:35 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Sat, Sep 24, 2022 at 8:45 AM Himanshu Upadhyaya
> <upadhyaya(dot)himanshu(at)gmail(dot)com> wrote:
> > Here our objective is to validate if both Predecessor's xmin and current
> Tuple's xmin are same then cmin of predecessor must be less than current
> Tuple's cmin. In case when both tuple xmin's are same then I think
> predecessor's t_cid will always hold combo CID.
> > Then either one or both tuple will always have a combo CID and skipping
> this check based on "either tuple has a combo CID" will make this if
> condition to be evaluated to false ''.
>
> Fair point. I think maybe we should just remove the CID validation
> altogether. I thought initially that it would be possible to have a
> newer update with a numerically lower combo CID, but after some
> experimentation I don't see a way to do it. However, it also doesn't
> seem very useful to me to check that the combo CIDs are in ascending
> order. I mean, even if that's not supposed to happen and does anyway,
> there aren't really any enduring consequences, because command IDs are
> ignored completely outside of the transaction that performed the
> operation originally. So even if the combo CIDs were set to completely
> random values, is that really corruption? At most it messes things up
> for the duration of one transaction. And if we just have plain CIDs
> rather than combo CIDs, the same thing is true: they could be totally
> messed up and it wouldn't really matter beyond the lifetime of that
> one transaction.
>
> Also, it would be a lot more tempting to check this if we could check
> it in all cases, but we can't. If a tuple is inserted in transaction
> T1 and ten updated twice in transaction T2, we'll have only one combo
> CID and nothing to compare it against, nor any way to decode what CMIN
> and CMAX it originally represented. And this is probably a pretty
> common type of case.
>
> ok, I will be removing this entire validation of cmin/cid in my next patch.

> >> + if (TransactionIdEquals(pred_xmin, curr_xmin) &&
> >> + (TransactionIdEquals(curr_xmin, curr_xmax) ||
> >> + !TransactionIdIsValid(curr_xmax)) && pred_cmin >=
> curr_cmin)
> >>
> >> I don't understand the reason for the middle part of this condition --
> >> TransactionIdEquals(curr_xmin, curr_xmax) ||
> >> !TransactionIdIsValid(curr_xmax). I suppose the comment is meant to
> >> explain this, but I still don't get it. If a tuple with XMIN 12345
> >> CMIN 2 is updated to produce a tuple with XMIN 12345 CMIN 1, that's
> >> corruption, regardless of what the XMAX of the second tuple may happen
> >> to be.
> >
> > tuple | t_xmin | t_xmax | t_cid | t_ctid |
> tuple_data_split |
> heap_tuple_infomask_flags
> >
> >
> -------+--------+--------+-------+--------+---------------------------------------------+------------------------------------------------------------------------------------------------------------------
> > -------------
> > 1 | 971 | 971 | 0 | (0,3) |
> {"\\x1774657374312020202020","\\x01000000"} |
> ("{HEAP_HASVARWIDTH,HEAP_COMBOCID,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_HOT_UPDATED}",{})
> > 2 | 971 | 0 | 1 | (0,2) |
> {"\\x1774657374322020202020","\\x02000000"} |
> ("{HEAP_HASVARWIDTH,HEAP_XMAX_INVALID}",{})
> > 3 | 971 | 971 | 1 | (0,4) |
> {"\\x1774657374322020202020","\\x01000000"} |
> ("{HEAP_HASVARWIDTH,HEAP_COMBOCID,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY
> > _TUPLE}",{})
> > 4 | 971 | 972 | 0 | (0,5) |
> {"\\x1774657374332020202020","\\x01000000"} |
> ("{HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE}",{})
> > 5 | 972 | 0 | 0 | (0,5) |
> {"\\x1774657374342020202020","\\x01000000"} |
> ("{HEAP_HASVARWIDTH,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE}",{})
> >
> > In the above case Tuple 1->3->4 is inserted and updated by xid 971 and
> tuple 4 is next update by xid 972, here t_cid of tuple 4 is 0 where as its
> predecessor's t_cid is 1, because in Tuple 4 t_cid is having command ID of
> deleting transaction(cmax), that is why we need to check xmax of the Tuple.
> >
> > Please correct me if I am missing anything here?
>
> Hmm, I see, so basically you're trying to check whether the CID field
> contains a CMIN as opposed to a CMAX. But I'm not sure this test is
> entirely reliable, because heap_prepare/execute_freeze_tuple() can set
> a tuple's xmax to InvalidTransactionId even after it's had some other
> value, and that won't do anything to the contents of the CID field.
>

ok, Got it, as we are removing this cmin/cid validation so we don't need
any change here.

--
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Rushabh Lathia 2022-09-27 06:52:50 Re: DROP OWNED BY is broken on master branch.
Previous Message Dilip Kumar 2022-09-27 06:33:18 Re: making relfilenodes 56 bits