Re: Weird behaviour after update from 12.2 to 12.3 version

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Andrii Palko <palko1902(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Weird behaviour after update from 12.2 to 12.3 version
Date: 2020-08-17 16:55:09
Message-ID: CAKFQuwYuKKp5dMyZfy358pG-7ZivwCcuaYhL9x1ng0O+kdywEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Aug 17, 2020 at 9:26 AM Andrii Palko <palko1902(at)gmail(dot)com> wrote:

> Update to version 12.3 broke this query. In 12.2 it works as expected but
> in version 12.3 it works conversely. It updates the newer row by
> information from the old row.
>
> As I understand such breaking changes shouldn't be in the minor version
> update, so it looks strange for me, also I didn't find anything about it in
> releases notes.
>

A minor version update doesn't guarantee that a incorrectly written query
that happens to give the expected results will continue to do so if
expected results are not due to some guarantee.

Looking at the query I don't see that the WHERE clause is guaranteeing that
the product_attribute_options row being updated is the "old row".

What I believe is happening here is since both nextRow and
product_attribute_options contain two rows the UPDATE statement join
produces 4 different input tuple pair orderings:

(new, old), (new, new), (old, old), (old, new)

I'm beginning to disbelieve your claim that this works properly in
12.2...though I think it's because the UPDATE forces only two of the four
possibilities to ever be considered.

The (new, new) and (old, old) pairs don't matter but I would expect that
the net result of this would be to always swap your old and new details.
But even if the swapping doesn't occur whether (new, old) or (old, new) is
evaluated first undefined. Supposedly in 12.2 (old, new) was evaluated
while in 12.3 (new, old) is.

I'm sure I missing some detail here to make this a truly accurate
depiction of reality but in short your UPDATE join is non-deterministic and
you need to fix your query because non-determinstic results can change for
any number of reasons including minor release updates (though more commonly
plan shape due to changing statistics is the more common cause). I do not
believe the lone ORDER BY in the CTE is sufficient to impose determinism.

It looks as if you only need one row to be produced by the CTE so a LIMIT 1
there is warranted and probably sufficient. Adding an old.id <> new.id
would make the update more efficient since it wouldn't attempt to update
more than one row (assuming that only two rows are present in the table).

David J.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Cyril Jouve 2020-08-17 20:48:11 Re: BUG #16508: using multi-host connection string when the first host is starting fails
Previous Message James Lucas 2020-08-17 16:21:42 Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop