Re: [BUG?] check_exclusion_or_unique_constraint false negative

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: [BUG?] check_exclusion_or_unique_constraint false negative
Date: 2025-08-25 11:51:27
Message-ID: CAA4eK1+SCb3aiMdkznTo84Rw+t1824QETRM_J4rK=ddRsDvzhQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 25, 2025 at 4:19 PM Mihail Nikalayeu
<mihailnikalayeu(at)gmail(dot)com> wrote:
>
> > Why only by luck?
>
> I mean last_write_win provides the same results in the following cases:
> * we found the tuple, detected a conflict, and decided to ignore the
> update coming from the publisher
> * we were unable to find the tuple, logged an error about it, and
> ignored the update coming from the publisher
>
> In both cases, the result is the same: the subscriber version remains
> in the table.
>

Right, so we can say that it will be consistent.

> > Then these may not lead to eventual consistency for such cases. So,
> > not sure one should anyway rely on these.
>
> But with the fixed snapshot dirty scan, it becomes possible to
> implement such strategies.
> Also, some strategies require some kind of merge function for tuples.
> In my understanding, even last_write_win should probably compare
> timestamps to determine which version is "newer" because time in
> distributed systems can be tricky.
> Therefore, we have to find the tuple if it exists.
>
> > BTW, then isn't it possible that INSERT happens on a different page?
>
> Yes, it is possible - in that case, the bug does not occur. It only
> happens if a new TID of some logical tuple is added to the same page.
>

What if the new insert happens in a page prior to the current page? I
mean that the scan won't encounter the page where Insert happens.

> Just to clarify, this is about B-tree pages, not the heap.
>
> > I think this questions whether we consider the SnapshotDirty results
> > correct or not.
>
> In my understanding, this is clearly wrong:
> * such behavior is not documented anywhere
>

I agree. This is where we need inputs.

> * usage patterns assume that such things cannot happen
> * new features struggle with it. For example, the new update_deleted
> logging may fail to behave correctly
> (038_update_missing_with_retain.pl in the patch) - so how should it be
> used? It might be correct, but it also might not be...
>
> Another option is to document the behavior and rename it to SnapshotMaybe :)
> By the way, SnapshotSelf is also affected.
>

BTW, do we know the reason behind using SnapshotDirty in the first
place? I don't see any comments in the nearby code unless I am missing
something.

> > The case of logical replication giving wrong results
> > [0] is the behavior from the beginning of logical replication.
>
> Logical replication was mainly focused on replication without any
> concurrent updates on the subscriber side. So, I think this is why the
> issue was overlooked.
>

The other possibility is that as this is a rare scenario so we didn't
consider it.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2025-08-25 12:28:44 Re: pg_waldump: support decoding of WAL inside tarfile
Previous Message Damien Clochard 2025-08-25 11:42:35 Re: [PATCH] Generate random dates/times in a specified range