Re: Minor bug affecting ON CONFLICT lock wait log messages

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Minor bug affecting ON CONFLICT lock wait log messages
Date: 2016-03-22 04:02:19
Message-ID: CAM3SWZTM8yBxtS9enktjVNeEQRqNSSjw7zXnP+y0dx5Q1z24iQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 21, 2016 at 3:38 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> Basically, unlike with the similar nbtinsert.c code, we're checking
>> someone else's tuple in the speculative insertion
>> check_exclusion_or_unique_constraint() case that was changed (or it's
>> an exclusion constraint, where even the check for our own tuple
>> happens only after insertion; no change there in any case). Whereas,
>> in the nbtinsert.c case that I incorrectly duplicated, we're
>> specifically indicating that we're waiting on *our own* already
>> physically inserted heap tuple, and say as much in the
>> XLTW_InsertIndex message that makes it into the log.
>
> I don't quite follow how we're indicating that we're waiting on our own
> already physically inserted heap tuple in the log?

Because it's XLTW_InsertIndex in
check_exclusion_or_unique_constraint() now, this is the message:

case XLTW_InsertIndex:
cxt = gettext_noop("while inserting index tuple (%u,%u) in
relation \"%s\"");
break;

I guess what's at issue is whether or not it's okay that the (heap)
tuple TID shown (by this message) when waiting in
check_exclusion_or_unique_constraint() during ON CONFLICT is *not* our
own -- rather, it's the other guy's for this new use of
XLTW_InsertIndex. Does the message itself convey a contrary meaning,
or is it ambiguous in a way that supports either interpretation (i.e.
does that ambiguity allow us to leave things as they are)?

Certainly, this is unlike the nbtinsert.c XLTW_InsertIndex case (the
only other instance of XLTW_InsertIndex), where we must have
physically inserted already, and report specifically on our own
physically inserted TID when this message is shown. Does that
inconsistency alone make this wrong?

I think that XLTW_InsertIndex is intended to mean our own TID, based
on .1) the only precedent we have, and .2) based on the fact that
there is a distinct XLTW_InsertIndexUnique case at all.

BTW, just so the difference is clear, I point out that
XLTW_InsertIndexUnique (which I now propose to change
check_exclusion_or_unique_constraint() to use) says:

case XLTW_InsertIndexUnique:
cxt = gettext_noop("while checking uniqueness of tuple (%u,%u) in
relation \"%s\"");
break;

> One of the reasons I figured the XLTW_InsertIndex message made sense in
> this case is that it'd provide a consistent result for users.
> Specifically, I don't think it makes sense to produce different messages
> based only on who got there first. I don't mind having a different
> message when there's something different happening (there's exclusion
> constraints involved, or you're building an index, etc).

I see your point. That's what I thought, initially.

>> It seems fine to characterize a wait here as "checking the uniqueness
>> of [somebody else's] tuple", even though technically we're checking
>> the would-be uniqueness were we to (speculatively, or actually)
>> insert. However, it does not seem fine to claim ctid_wait as a tuple
>> we ourselves inserted.
>
> I don't think either message really fits here, unfortunately. We're not
> actually checking the uniqueness of someone else's tuple here either,
> after all, we're waiting to see what happens with their tuple because
> ours won't be unique if it goes in with that other tuple in place, if
> I'm following along correctly.

Well, we're determining if there was a conflict or not, in the first
phase of speculative insertion. That means that we need to see if an
update is required (for example). But that needs to behave just like
checking the uniqueness of an existing thing. It's just that our own
not-yet-physically-inserted "conceptual" tuple needs to be the thing
that makes this existing tuple (from some other xact) non-unique.

If that sounds weird, consider that in the standard, non-speculative
exclusion constraint case, the physical insertion actually has already
occurred, but even still we actually are waiting on the other guy's
tuple/xact (and report the other guy's TID, not our own, unlike with
nbtinsert.c). Notice that we make sure that it's another XID towards
the top of the loop within check_exclusion_or_unique_constraint()).

And so, its message says:

case XLTW_RecheckExclusionConstr:
cxt = gettext_noop("while checking exclusion constraint on tuple
(%u,%u) in relation \"%s\"");
break;

Of course, that this appeared for ON CONFLICT DO NOTHING with a B-Tree
index in 9.5.1 was wrong, but only because it said "exclusion
constraint" rather than "unique index". That's about what
XLTW_InsertIndexUnique says already, which is why I now think we
should just use XLTW_InsertIndexUnique.

> One thing I can say is that the XLTW_InsertIndex at least matches the
> *action* we're taking, which is that we're trying to INSERT.

Right, but I don't think that XLTW_InsertIndexUnique specifically
implies that we're not inserting, just as XLTW_RecheckExclusionConstr
does not specifically imply that we're not inserting (actually, we're
usually or always inserting with XLTW_RecheckExclusionConstr, so it
better not).

> I don't feel terribly strongly about that position and so if others
> feel the XLTW_InsertIndexUnique message really would be better, I'd be
> happy to commit the change.

I'd also like to hear other opinions, if any are to be had. Sorry that
I changed my mind, but it's a subtle issue, I'm sure you'll agree. I'm
not going to push on this, but I want to be sure that we're happy with
this.

To reiterate, I think it boils down to: Is it okay that this new
XLTW_InsertIndex case reports someone else's TID, while the only other
XLTW_InsertIndex case has always reported our own TID?

Discussing these sorts of "ontological" questions reminds me just how
painful UPSERT was as a project. :-)

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-03-22 04:04:18 Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)
Previous Message Amit Kapila 2016-03-22 03:54:45 Re: OOM in libpq and infinite loop with getCopyStart()