Re: Assert() failures during RI checks

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assert() failures during RI checks
Date: 2020-03-22 22:30:04
Message-ID: 4963.1584916204@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2020-03-22 15:00:51 -0400, Tom Lane wrote:
>> Another thing that is very peculiar in this area is that the initial
>> assertion in the second stanza allows the case of result == TM_Deleted.

> In this case, isn't it clearly required to accept TM_Deleted? The HTSU
> after l1: obviously can return that, no?

Well, that's the question. If it is required, then we need to fix the
code in that stanza to not be trying to chase a bogus next-tid link.

> I wonder if we shouldn't just change the crosscheck case to set
> something other than TM_Updated, as it's not really accurate to say the
> tuple was updated.

Yeah, I was wondering about giving that a new result code, too.
It would be a little bit invasive and not at all back-patchable,
but (say) TM_SerializationViolation seems like a cleaner output ---
and we could define it to not return any TID info, as TM_Delete
doesn't. But we also need a fix that *can* be back-patched.

>> 2. Does the following stanza actually need to allow for the case
>> of a deleted tuple as well as an updated one? If so, blindly
>> trying to follow the ctid link is not so cool.

> Which ctid link following are you referring to? I'm not sure I quite
> follow.

The stanza right after the crosscheck one has always assumed that
it is dealing with an outdated-by-update tuple, so that chasing up
to the next TID is a sane thing to do. Maybe that's been wrong
since day one, or maybe we made it wrong somewhere along the line,
but it seems clearly wrong now (even without accounting for the
serialization crosscheck case). I think it just accidentally
doesn't cause problems in non-assert builds, as long as nobody
is assuming too much about which TID or XID gets returned.

> But for the crosscheck case (most of?)
> those wouldn't ever reach the ctid equivalency check, because of
> + if (IsolationUsesXactSnapshot())
> + ereport(ERROR,
> + (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> + errmsg("could not serialize access due to concurrent update")));
> like checks beforehand.

Yeah. For HEAD I'm imagining that callers would just throw
ERRCODE_T_R_SERIALIZATION_FAILURE unconditionally if they
get back TM_SerializationViolation.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Bychik 2020-03-22 22:32:07 Re: WAL usage calculation patch
Previous Message Thomas Munro 2020-03-22 22:23:32 Re: Index Skip Scan